PREFACE


1.  Introduction 

1.1.  Background:  The  theory  of  semi-martingales  is  a  major  part  of  the 
general  theory  of  stochastic  processes.  This  theory  has  undergone  massive 
growth  during  the  last  two  decades.  Much  of  the  impetus  for  the  rapid  advances 
in  this  branch  of  pure  mathematics  comes  from  efforts  to  solve  applied  problems. 
For  example,  the  theory  of  stochastic  integration  relative  to  semi-martingales  is 
the  right  tool  for  the  analysis  of  stochastic  dynamical  systems  and  so  for  a  large 
class  of  studies  carried  out  by  theoretical  physicists,  electronic  engineers,  system 
and  control  theorists,  probabilists  and  statisticians.  A  semi-martingale  is  in  fact 
a  general  model  of  the  engineer’s  “signal  plus  noise”  and  the  statistician’s  “trend 
plus  random  fluctuations”. 

1.2. History: Following the work of Paul Levy and Joseph Doob, the
epoch-making works in the general theory of stochastic processes are due to Paul Andre
Meyer [1967], all his papers in the 18 or so Strasbourg seminars in probability
(especially, No. 10), Kunita and Watanabe [1967], Meyer [1973], Dellacherie
[1972], Dellacherie and Meyer [1975, 1980] and Jacod [1979]. For anyone
interested in reading into the last two decades' progress in the theory of
semi-martingales, however, it must be understood that the principal original source is
the collection of Strasbourg Seminaires. These seminars are published by
Springer-Verlag in the Lecture Notes in Mathematics Series. The Universite de
Strasbourg Seminaire de Probabilites volumes not only contain the modern theory of
semi-martingales,  but  also  retain  the  false  starts,  the  subsequent  alterations  to 
the  “correct”  directions,  the  seemingly  interesting  and  possibly  uninteresting 
concepts  and  the  “ripening”  of  proofs  and  techniques  that  are  characteristic  of 
any developing mathematical theory. For example, see the Universite de
Strasbourg Seminaire de Probabilites in 1967, 1970, 1975, and 1980 for successive
accounts  of  stochastic  integration;  the  first  three  were  given  by  Meyer.  It  is  a  rare 
thing  to  be  able  to  observe  the  evolution  of  a  new  theory  and  to  see  it  mature  in 
such  a  short  period  of  time.  Most  of  the  credit  for  this  rapid  development 
probably belongs to a group of predominantly French mathematicians led by
Paul-Andre Meyer and loosely referred to as the “Strasbourg School”.

Metivier [1982] gives a slightly different emphasis to the subject than the works
of those previously mentioned. He starts his work with quasi-martingales and
bases the entire subject, from martingales to the stochastic integration of
semi-martingales, on the so-called Doleans measure. This is a very elegant
development.  The  use  of  the  Doleans  measure,  to  some  degree,  brings  stochastic 
integration  within  the  domain  of  classical  measure  theory,  a  fact  that  will  please 
a  large  number  of  mathematicians.  At  the  same  time,  Metivier’s  approach 
retains  the  stopping  time  flavor  of  the  Strasbourg  School.  The  additional 
distinctive  feature  of  this  excellent  work  is  that  most  results  are  formulated  for 
Banach  valued  processes,  thus  providing  a  theory  applicable  to  multi-dimensional 
processes. 

An additional book, in the spirit of Metivier, has recently been published. Kai Lai
Chung [1983], together with Ruth Williams, has written a clear and concise work
on  stochastic  integration.  Since  it  is  anchored  in  all  of  Chung’s  other  works  and 
those  of  J.  Doob  it  is  worth  reading.  The  only  shortcoming  from  the  standpoint 
of  this  note  is  that  it  only  considers  martingales  with  continuous  paths. 
Gopinath  Kallianpur’s  1980  work  on  stochastic  filtering  theory  also  skips  the 
point  process  case.  But  it  is  worth  reading,  if  only  to  appreciate  the  maturity  of 
the  continuous  parameter  filtering  problem  and  the  clarity  of  Kallianpur’s  style. 

The principal study of point processes from the standpoint of martingales is
‘Point Processes and Queues’ by P. Bremaud, 1981. This is an excellent treatise
on the theory of martingales applied to queuing and the filtering problem for
point processes. Bremaud develops his theory from first principles, relying on
Dellacherie’s Dual Previsible Projection Theorem rather than the Doob-Meyer
Decomposition  theorem  and  the  extensive  recent  developments  in  stochastic 
integration  relative  to  semi-martingales.  It  is  the  best  introduction  to  the  subject 
from  the  standpoint  of  applications  and  much  of  what  will  follow  in  this  note 
concerning  filtering  is  borrowed,  in  one  way  or  another,  from  the  ground-breaking 
work  of  Bremaud  since  1972. 

Outside of some examples illustrating the methodology, a few simple results in
Chapter 1 and a personal viewpoint, all of the mathematics in this note is known.
The  opening  Chapter  introduces  a  discrete  parameter  version  of  the  martingale 
calculus  that  will  be  introduced  in  the  remaining  five  chapters.  The  purpose  of 
this  Chapter,  and  its  threads  into  the  later  sections  where  the  continuous 
parameter  model  is  studied,  is  to  provide  some  intuition  and  background  for  the 
study  of  these  technically  difficult  subjects.  Starting  from  first  principles,  many 
of  the  hard  to  reach  concepts  of  the  continuous  time  model  are  almost  trivial  in 
the discrete model; certainly, the proofs and technical details are elementary. The
case of discrete parameter point processes is of particular interest (Section 1.10).
One  can  only  wonder  why  this  material  is  not  written  down  somewhere.  In  most 
instances results about such processes follow from the general theory in a
relatively straightforward manner (e.g., Section 4.7 of Chapter 4), but that does
not  replace  the  insight  obtained  from  deriving  these  results  directly.  Moreover, 
having  to  go  to  the  general  theory  of  marked  point  processes  in  order  to  solve  an 
applied  problem  involving  discrete  parameter  point  processes  seems  a  bit 
excessive  and  would  certainly  inhibit  applications  of  the  basic  concepts  of  the 
theory. 

1.2.1.  Contents:  The  six  chapters  contain  foundation  material  on  stopping 
times, filtrations, various types of function measurability, martingales, and a brief
description  of  integration  relative  to  martingales.  These  chapters  are  meant  to 
constitute  a  brief  survey  and  introduction  to  this  material.  Therefore,  proofs  are 
given  only  when  they  pass  loose  criteria  based  on  brevity,  insight  and  simplicity. 
Chapter  1  contains  a  brief  introduction  to  nonlinear  filtering. 

There are two excellent surveys on martingales and stochastic integration. One,
by C. Dellacherie [1978], concentrates on stochastic integration. The other, due to
A.N.  Shiryayev,  is  very  broad.  Both  of  these  papers  are  true  surveys  in  that  they 
tell  what  has  been  accomplished  in  these  areas  and  appropriately  assume  that  the 
reader  has  some  understanding  of  the  area,  especially  probability  theory  and 
stochastic  processes  and  is  an  active  mathematician.  The  present  note,  on  the 
other  hand,  is  meant  to  be  both  a  survey  of  recent  developments  in  this  area  and 
an introduction to the basic theory. As such, definitions observe the
mathematical traditions of such things, examples and counterexamples are
supplied to aid in the understanding of new objects defined, and Theorems,
Corollaries and Lemmas are rigorously stated. But complete proofs, sketches or
indications  of  proofs  are  given  only  when  they  are  relatively  easy  and 
informative,  or  when  they  illustrate  the  meaning  of  newly  defined  concepts. 
Chapter  6  is  the  chapter  with  the  most  proofs  simply  because  it  is  impossible  to 
have  any  kind  of  understanding  of  the  stochastic  integral  without  them.  One  of 
the  reasons  this  is  true  is  that  most  readers  of  this  note  will  have  a  strong 
intuition  built  on  classical  theories  of  integration  and  this  knowledge,  combined 
with  the  fact  that  notationally  most  integrals  look  alike  and  have  similar 
properties,  will  mislead  rather  than  support  their  understanding  of  the  stochastic 
integral. 

1.2.2.  Purpose:  The  primary  purpose  of  the  note  is  twofold:  (i)  To  summarize  a 
recently  evolved  theory  and  indicate  how  it  might  be  applied  to  some  BRL  tasks; 
(ii)  To  form  a  foundational  document,  a  common  ground  for  an  interdisciplinary 
group  within  the  C'SM  branch  of  BRL-SECAD,  all  of  whom  are  concerned  with 
various  mathematical  aspects  of  stochastic  network  problems  in  Army 
Communication,  Command  and  Control. 
SURVEY AND INTRODUCTION TO STOPPING TIMES,


MARTINGALES  AND  STOCHASTIC  INTEGRATION 


Chapter  I.  A  Discrete  Time  Model  Of  Martingale  Calculus 

1.1. Introduction: This first section is meant to be a discrete time model for
most of the topics in this note. The initial purpose of this section, though,
was only to guide the reader (and writer) through the intricacies of stochastic
integration by first studying martingale transforms (stochastic integrals for
discrete time processes). This led to introducing the quadratic variation and
variance processes, and the original Doob decomposition of submartingales.
Before long it was clear that most of the subsequent topics would be much easier
to  discuss,  in  the  sometimes  sketchy  manner  appropriate  to  a  survey  and 
introduction,  if  one  could  lean  on  an  intuition  built  on  the  sequences  of  random 
variables.  Thus  the  present  form  of  this  section  became  an  attempt  to  provide 
such  an  intuition  or  background  before  launching  off  into  the  much  more 
sophisticated  concepts  required  by  processes  indexed  by  a  continuum. 

This  Chapter  is  not  meant  to  be  a  summary  of  the  theory  of  martingale 
sequences.  This  subject  is  huge.  For  an  almost  flawless  treatment  of  this  theory 
one  would  surely  read  Neveu's  book  [1975]  or  Meyer’s  [1973]  Springer-Verlag 
monograph.  For  a  treatment  of  martingale  sequences  that  has  a  large  number  of 
examples  and  gives  a  very  readable  account  of  the  theory,  one  should  see  Karlin 
and  Taylor  (1975).  Rather,  this  Chapter  is  an  attempt  to  give  a  brief  description 
of  a  "discrete  time  martingale  calculus",  applicable  to  the  study  of  discrete 
(stochastic)  dynamical  systems  (Section  1.10). 

It may also prove useful to see how a few of the concepts introduced here must
be modified when “time” becomes non-denumerable, most notably the concept of
“previsibility”. It took many years for the role of such processes to be
understood. In Meyer’s 1967 book, he talks about “natural” processes instead of
previsible ones. The connection between the two was made in an elegant paper
by K.M. Rao (1969), but again only the Strasbourg Seminars (Meyer (1970)) show
how the importance of the concept emerged. By the time one reads Dellacherie
and Meyer (1980), previsible processes are referred to as the “Borel” functions of
the general theory of stochastic processes.




1.2. Filtrations and Stopping Times: Let Z be the set of non-negative
integers and let $(\Omega, H, P)$ denote a probability space, where H is a $\sigma$-algebra of
subsets of $\Omega$ and P is a probability measure on H. A sequence, $G = (G_n, n \in Z)$, of
sub $\sigma$-algebras of H is called a filtration, if (relative to set inclusion) the $G_n$ are
nondecreasing functions of n. We will assume that $G_0$ is complete, in the sense
that it contains all subsets of events (i.e., members of H) which are assigned
probability zero by P. $G_\infty$ will denote the smallest $\sigma$-algebra containing all the
$G_n$:

$$G_\infty = \sigma\Big( \bigcup_{k \ge 0} G_k \Big).$$

($G_\infty$ is a sub-$\sigma$-algebra of H.)

Perhaps  the  single  most  important  concept  in  martingale  theory  is  the  notion  of 
stopping  time.  This  is  more  evident  in  the  continuous  case  than  here.  But  even 
here  where  all  we  are  trying  to  do  is  lay  a  foundation  of  sorts  for  things  to  come, 
this  notion  plays  a  fundamental  role.  Stopping  times  are  defined  relative  to 
filtrations,  so  to  motivate  the  definition  and  at  the  same  time  give  a  concrete 
example  of  a  filtration,  we  first  consider  the  following 

Example: Let $X = (X_n, n = 0,1,2,\ldots)$ be a sequence of random variables
executing a symmetric random walk on the real line, starting from the origin
(hence, $X_0 = 0$). For definiteness, suppose that $X_n$ represents the value of a
game at its nth trial and $X_n = X_{n-1} + 1$, and $X_n = X_{n-1} - 1$, each with
probability $\tfrac{1}{2}$. Let $G_0 = \{\emptyset, \Omega\}$ and $G_n$ denote the smallest $\sigma$-algebra
generated by the $X_k$, $0 \le k \le n$: $G_n := \sigma(X_k, 0 \le k \le n)$. $G_1$ is the family
consisting of the empty set and unions of the partition $\{\omega : X_1(\omega) = 1\}$,
$\{\omega : X_1(\omega) = -1\}$; $G_2$ is the family consisting of the empty set and unions of the
partition

$\{\omega : X_1(\omega) = -1 \text{ and } X_2(\omega) = -2\}$,

$\{\omega : X_1(\omega) = -1 \text{ and } X_2(\omega) = 0\}$,

$\{\omega : X_1(\omega) = 1 \text{ and } X_2(\omega) = 0\}$,

$\{\omega : X_1(\omega) = 1 \text{ and } X_2(\omega) = 2\}$.

Notice that the union of the first two of these events and then the union of the
second two give the events that make up the partition defining $G_1$. Thus, we
have that $G_0 \subset G_1 \subset G_2$. The remaining $G_k$ are defined in a similar fashion and
monotonicity continues to hold. $G = (G_n)$ is therefore a filtration. This is an
example of a special type of filtration called variously the natural filtration or
the internal history of the process X, or the filtration generated by X.
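The nesting of the natural filtration in this example can be checked by brute force on a finite horizon. The Python sketch below is illustrative only: the horizon `N = 3` and the helper names `paths`, `walk`, `atoms` and `refines` are assumptions introduced here and do not appear in the original note. It lists the atoms (partition cells) that generate each $G_n$ and confirms that each partition refines the one before it.

```python
from itertools import product

def paths(n_steps):
    """All 2**n_steps equally likely step sequences of the symmetric walk."""
    return list(product((-1, 1), repeat=n_steps))

def walk(steps):
    """Positions X_0, X_1, ..., X_N for one step sequence, with X_0 = 0."""
    xs = [0]
    for s in steps:
        xs.append(xs[-1] + s)
    return xs

def atoms(n, n_steps):
    """Partition of the sample space by the observed values (X_0, ..., X_n)."""
    cells = {}
    for w in paths(n_steps):
        key = tuple(walk(w)[: n + 1])
        cells.setdefault(key, set()).add(w)
    return cells

def refines(fine, coarse):
    """True if every cell of `fine` lies inside some cell of `coarse`."""
    return all(any(f <= c for c in coarse.values()) for f in fine.values())

N = 3  # hypothetical finite horizon
atoms_by_time = [atoms(n, N) for n in range(N + 1)]
atom_counts = [len(a) for a in atoms_by_time]
nested = all(refines(atoms_by_time[n + 1], atoms_by_time[n]) for n in range(N))
```

Each $G_n$ consists of the empty set and all unions of its atoms, so refinement of the partitions is the same thing as monotonicity of the $\sigma$-algebras.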

Now, for each $\omega \in \Omega$, let $T(\omega) := \min\{k : |X_k(\omega)| = 2\}$, if $\{\ldots\}$ is not empty and
$T(\omega) := \infty$, if $\{\ldots\} = \emptyset$. T is the first time that the process, X, takes on the
value plus or minus 2. T is a mapping of $\Omega$ into the extended, nonnegative, real
line, $R_+ := [0,\infty]$, with the property that the event $[T \le n] := \{\omega : T(\omega) \le n\}$ is a
member of $G_n$. To see this it is enough to look at a couple of cases; the formal
induction will be clear. Explicitly, $[T \le 0] = [T \le 1] = \emptyset$, so these two events are
contained in $G_0$ and $G_1$. Since $[T \le 2] = [X_1 = -1, X_2 = -2] \cup [X_1 = 1, X_2 = 2]$, we have
$[T \le 2] \in G_2$. Viewing the family of events, $G_n$, as the history of the process up to
“time” n, this means that the value of T at time n depends only on the history of
the process, X, up to and including time n. In this sense, the extended valued
random variable T is said to be a ‘stopping time’ relative to the filtration
(history) G. Contrast this with the variable, S, defined by setting
$S(\omega) := \max\{k : 1 \le k \le 5, |X_k(\omega)| = 2\}$, if such a k exists and $\infty$ otherwise.
Clearly, the values of S depend on the entire history of the paths, $n \to X_n(\omega)$, of
the process X. Therefore, S is not a stopping time relative to the filtration G,
according to the following

1.2.1. Definition: A mapping T from $\Omega$ to $\bar{Z} := Z \cup \{\infty\}$ is said to be a
G-stopping time (optional time) if

$$\{\omega \mid T(\omega) = n\} \in G_n$$

for all n in Z. When T is a G-stopping time, the $\sigma$-algebra, $G_T$, of events that
occur prior to T, is defined by setting

$$G_T = \{B \in G_\infty \mid B \cap [T = n] \in G_n \text{ for all } n \text{ in } Z\}.$$

By definition, $[T = n]$ is in $G_\infty$ for all n in Z, so that $[T = \infty]$, the complement
of the union of all these events, is also an event in $G_\infty$. Consequently, the mapping
$T : \Omega \to \bar{Z}$ is $G_\infty$-measurable. Hence, T is a random variable on $(\Omega, G_\infty)$ and so on $(\Omega, H)$.

Finally, it is immediate that T is a G-stopping time iff $[T \le n] \in G_n$ for all n in
Z. Just notice that

$$[T \le n] = \bigcup_{k=0}^{n} [T = k] \in G_n$$

for all n, since $[T = k] \in G_k$ is contained in $G_n$, for all $k \le n$. Conversely,
$[T = n] = [T \le n] - [T \le n-1]$ is in $G_n$. More trivial, but of some interest for later
comparison to the situation when the stopping times are $R_+$ valued is that
$[T \le n] \in G_n$ for all n iff $[T < n] \in G_{n-1}$ for all $n \ge 1$.

We will return to the topic of stopping times in general after a few more
definitions.  In  Chapter  2,  where  the  model  is  more  complex,  we  will  give  several 
more  examples. 
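The contrast between the first hitting time T and the last hitting time S of the random walk example can be tested mechanically. The sketch below is a hedged illustration, not part of the original note: it assumes a finite horizon `N = 5` (matching the range used to define S), and the names `OMEGA`, `T`, `S` and `is_stopping_time` are hypothetical. The check uses the criterion that the event $[\tau \le n]$ must be decided by the observed values $X_0, \ldots, X_n$ alone.

```python
from itertools import product
import math

N = 5  # hypothetical horizon: each path is determined by N coin flips

def walk(steps):
    """Positions X_0, ..., X_N of the walk, with X_0 = 0."""
    xs = [0]
    for s in steps:
        xs.append(xs[-1] + s)
    return xs

# Finite sample space: every path of the walk up to time N.
OMEGA = [tuple(walk(w)) for w in product((-1, 1), repeat=N)]

def T(x):
    """First time |X_k| = 2, or infinity if this never happens by time N."""
    hits = [k for k in range(N + 1) if abs(x[k]) == 2]
    return min(hits) if hits else math.inf

def S(x):
    """Last time k in 1..N with |X_k| = 2, or infinity if there is none."""
    hits = [k for k in range(1, N + 1) if abs(x[k]) == 2]
    return max(hits) if hits else math.inf

def is_stopping_time(tau):
    """Check that [tau <= n] is determined by (X_0, ..., X_n) for n = 0..N."""
    for n in range(N + 1):
        seen = {}
        for x in OMEGA:
            key = x[: n + 1]
            flag = tau(x) <= n
            # Two paths with the same history up to n must agree on the event.
            if seen.setdefault(key, flag) != flag:
                return False
    return True
```

Here `is_stopping_time(T)` holds, while `is_stopping_time(S)` fails: the paths 0,1,2,1,0,1 and 0,1,2,3,2,1 share the same history up to time 2, yet S equals 2 on the first and 4 on the second.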

1.3. Stochastic Processes, Previsibility and Optionality: Let Z be the set
of non-negative integers. A sequence, $X = (X_n, n \in Z)$, of mappings of $\Omega$ into the
set of real numbers is called a real valued stochastic process if for each n, the
mapping, $\omega \to X_n(\omega)$, of $\Omega$ into R is H-measurable. That is, for each n in Z,
$\{\omega \in \Omega : X_n(\omega) \in B\} \in H$, for all real Borel sets, B. Of course, this is just the statement
that for each n, $X_n$ is a real valued random variable on the measurable space
$(\Omega, H)$.

Further, X is said to be G-adapted, if $X_n$ is $G_n$-measurable for each n in Z. If
X is adapted to G, then we also say X is observable relative to those processes
which generate G. It is useful to realize that if X is G-adapted, then measurable
functions of successive finite segments, $g(X_1, \ldots, X_n)$, of X define G-adapted
processes.

Convention:  Throughout  this  Chapter,  whenever  processes  are  discussed  it  will 
always  be  assumed  that  they  are  adapted  relative  to  the  same  fixed  filtration, 
unless stated otherwise. This is no restriction in generality since we have not
excluded the trivial filtration, $(G_n, n \in Z)$, where $G_n = H$ for all n.

For the discrete time processes, the important notion of “previsibility” takes on a
very simple and intuitive meaning: $V = (V_n, n \in Z)$ is said to be G-previsible
if each random variable $V_n$ is $G_{n-1}$-measurable. This description of previsibility is
intuitive since if a process, $(V_n)$, is G-previsible, when $G_n = \sigma(X_k, k = 0,1,\ldots,n)$
for some process, X, then $V_n$ is a Borel function of $X_k$, for $k = 0,1,\ldots,n-1$. Thus,
the value of the process V at time n is completely determined by the value of X
at the times $0,1,2,\ldots,n-1$. That is, just before time n (prior to n) the value of $V_n$
is known; it is previsible. “Previsible” is the French term; in English it is usually
translated to “predictable”. We use the former term because the notion of
predictability as a technical term carries too many possible meanings (e.g., in wide
sense stationary time series analysis) and the English interpretation of the term
“previsible”, viz., “being visible before”, rather precisely describes the intended
technical meaning.

Later in this chapter we will need a reasonably precise understanding of the
statement that a (discrete parameter) process, X, is “evaluated at a stopping
time”, $X_{T(\omega)}(\omega)$. An immediate difficulty that one might notice is that stopping
times take values in $\bar{Z}_+$, while for any $\omega$ in $\Omega$, $n \to X_n(\omega)$ is defined only on $Z_+$.
This can be overcome by setting, for example, the value of the process at
“infinity” equal to zero, for all $\omega$ in $\Omega$. This is equivalent to writing
$X_{T(\omega)}(\omega) 1_{[T<\infty]}$ in place of $X_{T(\omega)}(\omega)$. As it is convenient, we will use one or the other, or
just qualify appropriate statements by saying “on $[T < \infty]$”, while writing
either $X_{T(\omega)}(\omega)$ or $X_T$. In any case, we then need to know what conditions must
be imposed on X so that $X_T$ is a random variable. Hence, we must first say what
is meant for a random variable to be defined on a subset of $\Omega$. So let $\Omega_0$ be a
subset of $\Omega$ of the probability space, $(\Omega, H, P)$. The trace $\sigma$-algebra, denoted
$H \cap \Omega_0$, is the family $\{A \cap \Omega_0 : A \in H\}$ of subsets of $\Omega_0$. Of course, $\Omega_0$ may not
belong to H, but if it does, then the trace $\sigma$-algebra is just $\{A : A \in H, A$ a subset
of $\Omega_0\}$. Now we are all set: X is a real valued random variable on a subset,
$\Omega_0$, of $\Omega$ (on $(\Omega_0, H \cap \Omega_0, P)$) iff $X^{-1}(B) \in H \cap \Omega_0$ for all real Borel sets, B.
Further, we can talk about a G-measurable random variable or function
defined on a subset, $\Omega_0$, where G is a sub $\sigma$-algebra of H, by replacing H by G in
the definition above. Then the following result holds (Neveu [1975]).

1.3.1. Lemma: If X is a G-adapted process and T is a G-stopping time, then the
random variable $X_T$, defined on $\{\omega : T(\omega) < \infty\}$ by setting $X_T(\omega) := X_{T(\omega)}(\omega)$, is
$G_T$-measurable.

The random variable of this definition-theorem is obtained as a result of
evaluating the process at the stopping time T. Perhaps the most important
example is that of a stopped process. If T is a stopping time, then
$T_n(\omega) := (T \wedge n)(\omega) := T(\omega) \wedge n$, the minimum of numbers $T(\omega)$ and n, defines a
stopping time for each $n \in Z_+$. Let $X^T_n(\omega) := X_{T_n(\omega)}(\omega)$. We define the process X
stopped at time T by setting $X^T := (X^T_n, n \in Z_+)$.

The paths, $n \to X^T_n(\omega)$, of the stopped process are constant to the right of the
interval $[0, T(\omega)]$. Stopped processes are fundamental to modern martingale
theory.  Later  in  these  notes,  the  notion  of  path-wise  “localization”  of  a  process 
is  introduced,  whereby  properties  such  as  “boundedness”  are  attributed  to  the 
process  “locally”  in  the  sense  that  the  stopped  process  is  bounded.  For  example, 
from  elementary  calculus,  a  path-wise  continuous  process  is  locally  bounded. 
This technique becomes a powerful tool for extending certain results to more and
more general classes of processes and will be used extensively in Chapter 6.
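A stopped path is easy to compute once the stopping time is known. The following Python sketch is only an illustration under stated assumptions: the function names `first_hit` and `stopped`, the level 2, and the sample path are all hypothetical choices made here. It freezes a path of the random walk at the first time its absolute value reaches 2, using the rule $X^T_n = X_{T \wedge n}$.

```python
import math

def first_hit(path, level=2):
    """T = min{k : |X_k| = level}, or math.inf if the level is never reached."""
    for k, x in enumerate(path):
        if abs(x) == level:
            return k
    return math.inf

def stopped(path, t):
    """The stopped path n -> X_{min(t, n)}; for t = inf the path is unchanged."""
    return [path[n if t == math.inf else min(t, n)] for n in range(len(path))]

path = [0, 1, 2, 3, 2, 1]   # one hypothetical path of the walk
t = first_hit(path)         # the walk first reaches +2 at time 2
frozen = stopped(path, t)   # constant from time 2 onward
```

The stopped path agrees with the original up to and including time T and is constant afterwards, which is why the stopped walk in this example is bounded by 2 in absolute value.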

Another detail that we need throughout is a way to say two processes are
“equal”. That is, we need an equivalence relation on a set of processes. Let X
and Y be two discrete parameter processes defined on the same probability space
$(\Omega, H, P)$. Since the union of any countable collection of events of P-measure zero
is again an event of P-measure zero, the statements that

$$P(X_n = Y_n) = 1 \text{ for all } n \in Z_+ \quad\text{and}\quad P(X_n = Y_n \text{ for all } n \in Z_+) = 1$$

are equivalent. This is not true in the continuous parameter case considered in
Chapter 2, where processes having the first property are called “modifications” of
one another and those having the second are called indistinguishable. These
concepts being equivalent for discrete parameter processes, we will only use the
latter for now. Clearly, indistinguishability determines an equivalence relation on
the set of all processes defined on $(\Omega, H, P)$. So any processes or random
quantities which are discussed in this chapter are only specified to within membership
in a particular equivalence class. On occasion we will emphasize this point by
writing “a.s. P”, meaning “almost surely relative to the probability P”, or, “with
probability one” next to equalities and inequalities involving random quantities.

With  an  eye  toward  later  chapters,  we  also  remark  at  this  point  that  a  process 
which  is  indistinguishable  from  the  process  which  is  identically  zero  is  said  to  be 
evanescent. Subsets of $Z_+ \times \Omega$, called random sets, are said to be evanescent
if  their  indicator  functions  are  evanescent. 

1.4. Transforms of Stochastic Processes: Let $V = (V_n, n \in Z)$ and
$X = (X_n, n \in Z)$ be two processes. Extend the time domain of processes on $Z \times \Omega$ by
setting $X_{-1} = 0$ for all $\omega$ in $\Omega$. Set $\Delta X_k = X_k - X_{k-1}$; then, in particular, $\Delta X_0 = X_0$.

Given two processes X and V, define the process V.X on $\Omega$ by setting

$$(V.X)_n(\omega) := \sum_{k=0}^{n} V_k(\omega)\, \Delta X_k(\omega) \qquad (1)$$

for all n in Z and each $\omega \in \Omega$. V.X is called the transform of X by V. When we
want to use the transform of X by V to anticipate results about stochastic
integrals, we will sometimes call this transform a discrete integral of V with
respect to X. In this case we have in mind that $V_k = v_{t_k}$ and $X_k = x_{t_k}$,
$k = 1,2,\ldots,n$, where $0 = t_0 < t_1 < \cdots < t_n = t$, for some continuous parameter
processes v and x.

Equation (1) is also written in the forms

$$(V.X)_n(\omega) = V_0(\omega) X_0(\omega) + \sum_{k=1}^{n} V_k(\omega)\, \Delta X_k(\omega) = (V.X)_{n-1}(\omega) + V_n(\omega)\, \Delta X_n(\omega). \qquad (2)$$


As  a  discrete  integral  it  is  clear  that  the  transform  in  (1)  is  nothing  more  than  a 
particular  form  of  a  Darboux  sum  associated  with  a  Riemann  Stieltjes  integral. 
As  such,  in  later  chapters  of  this  note,  it  will  become  the  major  building  block  of 
stochastic  integrals  relative  to  various  types  of  (continuous  time)  processes. 
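The transform in (1) is a running sum, and a few lines of code make the recursion in its second form concrete. This Python sketch is illustrative only; the function name `transform` and the sample values of `x` and `v` are assumptions introduced here, not taken from the note.

```python
def transform(v, x):
    """(V.X)_n = sum_{k=0}^{n} V_k * (X_k - X_{k-1}), with X_{-1} := 0.

    Returns the list of values for n = 0, 1, ..., len(x) - 1.
    """
    out, total, prev = [], 0, 0
    for vk, xk in zip(v, x):
        total += vk * (xk - prev)  # add V_k * dX_k to the running sum
        prev = xk
        out.append(total)
    return out

x = [0, 1, 2, 1, 0, 1]   # hypothetical path of the random walk
v = [1, 1, 2, 4, 8, 1]   # hypothetical stakes placed on each step

gains = transform(v, x)          # cumulative winnings along the path
identity = transform([1] * 6, x) # V identically 1 returns X itself
```

With V identically equal to 1 the transform reproduces X, since the increments telescope; in the gambling reading of the random walk, `gains` is the fortune of a player who stakes $V_k$ on the kth trial.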

1.5.  The  Quadratic  Variation  and  Variance  Processes:  We  now  introduce 
two  more  processes  that  play  an  important  role  in  stochastic  integration.  These 
processes  also  form  a  link  back  to  classical  probability  and  statistics. 

Again let $X = (X_n, n \in Z)$ be any process and define the stochastic process, [X,X],
on $\Omega$ by setting

$$[X,X]_n(\omega) := X_0^2(\omega) + \sum_{k=1}^{n} \big(X_k(\omega) - X_{k-1}(\omega)\big)^2 = \sum_{k=0}^{n} \big(\Delta X_k(\omega)\big)^2, \qquad (3)$$

for all n in Z and each $\omega \in \Omega$. (Recall that $X_{-1} := 0$.) The increasing process
[X,X] is called the quadratic variation of X. Some writers ingeniously call it
square brackets X.

If Y is any other process parameterized by Z, we define the cross quadratic
variation, [X,Y], by polarization

$$[X,Y] := \tfrac{1}{2} \big( [X+Y, X+Y] - [X,X] - [Y,Y] \big). \qquad (4)$$

By elementary manipulations, this definition is equivalent to setting

$$[X,Y]_n = \sum_{k=0}^{n} \Delta X_k\, \Delta Y_k. \qquad (5)$$

Now, we assume that $E(X_n^2) < \infty$ for each n in Z; that is, $X \in L_2(P)$. Let
$G = (G_n, n \in Z_+)$ be the underlying filtration for the processes in this section and define
$G_{-1} = G_0$. Then set

$$\langle X,X \rangle_n := \sum_{k=0}^{n} E\{ (\Delta X_k)^2 \mid G_{k-1} \}$$

for each $n \ge 0$. For now, it is appropriate to call <X,X> the variance
process. Clearly, both <X,X> and [X,X] are increasing processes. It is important
to note, however, that [X,X] always exists, but <X,X>, as it has been defined,
exists only when X has finite second moments. Finally, note that <X,X> is a
G-previsible process, whereas [X,X] is only G-adapted.


The  covariance  process,  <X,Y>  is  defined  by  polarization,  as  in  the  case  of 
the  quadratic  variation,  and  leads  to  a  formula  analogous  to  equation  (5). 
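The equivalence between the polarization definition and the sum of products of increments is easy to confirm numerically. The Python sketch below is a hedged illustration: the helper names `increments` and `bracket` and the two sample paths are assumptions made here. It computes $[X,Y]_n$ as a running sum of $\Delta X_k \Delta Y_k$ and compares it with $\tfrac{1}{2}([X+Y,X+Y]_n - [X,X]_n - [Y,Y]_n)$.

```python
def increments(x):
    """dX_k = X_k - X_{k-1}, with the convention X_{-1} := 0."""
    return [xk - prev for prev, xk in zip([0] + list(x[:-1]), x)]

def bracket(x, y):
    """[X,Y]_n = sum_{k=0}^{n} dX_k * dY_k, returned for n = 0, 1, ..."""
    out, total = [], 0
    for dx, dy in zip(increments(x), increments(y)):
        total += dx * dy
        out.append(total)
    return out

x = [0, 1, 2, 1]   # hypothetical path of X
y = [1, 0, 2, 5]   # hypothetical path of Y
s = [a + b for a, b in zip(x, y)]

direct = bracket(x, y)
polarized = [
    (p - q - r) / 2
    for p, q, r in zip(bracket(s, s), bracket(x, x), bracket(y, y))
]
```

The two lists agree term by term, and `bracket(x, x)` is nondecreasing, as expected of a quadratic variation.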


1.5.1. With the notational agreements made at the beginning of the section,
notice that we can write any process in the form

$$X_n = \sum_{k=0}^{n} d_k, \qquad (7)$$

where the sequence $d_k := \Delta X_k$ is called the difference process associated with X.

1.5.2. Example: Assume, for this paragraph, that the $d_k$ are independent of
$G_{k-1}$, with $d_0$ independent of $G_0$, and have zero mean value and finite variance,
$\sigma_k^2$. Then $E\{ d_k^2 \mid G_{k-1} \} = E d_k^2 = \sigma_k^2$, so that $\langle X,X \rangle_n = \sum_{k=0}^{n} \sigma_k^2$. That is,
<X,X> is the variance of the process X. Thus, if $X_n$ is a sum of zero mean
random variables which are independent of the “past” and have finite variance, then
$\langle X,X \rangle_n$ reduces to an increasing, deterministic process which is equal to the
variance of $X_n$. For example, if X is the random walk of the previous example,
then the $(d_k)$ are independent, symmetric Bernoulli random variables.


Further, if Y is another process whose difference process has the same properties
as those of X in this example, then it is easy to see that $\langle X,Y \rangle_n$ is just the
covariance of X and Y, at time n.


Of  course,  <X,X>  is  in  general  not  a  deterministic  process,  but  it  is  always  an 
increasing  stochastic  process.  As  such,  it  is  perhaps  more  honestly  referred  to  as 
a  “stochastic  measure”  in  that  its  properties  derive  more  from  the  fact  that,  on 
each  path,  its  increments  define  a  positive  measure  of  the  algebra  of  all  subsets  of 
Z. 
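For the symmetric random walk of Example 1.5.2 the variance process can be compared with the variance of $X_n$ by exact enumeration. The sketch below is illustrative and rests on assumptions made here: the function names `variance_of_walk` and `variance_process` are hypothetical, the walk starts at $X_0 = 0$ (so $\sigma_0^2 = 0$), and every later step is plus or minus 1 with probability one half (so $\sigma_k^2 = 1$).

```python
from fractions import Fraction
from itertools import product

def variance_of_walk(n):
    """Var(X_n) for the symmetric walk, by enumerating all 2**n step sequences."""
    p = Fraction(1, 2 ** n)  # each step sequence is equally likely
    vals = [sum(w) for w in product((-1, 1), repeat=n)]
    mean = sum(p * v for v in vals)
    return sum(p * (v - mean) ** 2 for v in vals)

def variance_process(sigma2):
    """<X,X>_n = sum_{k<=n} sigma_k^2 for independent zero mean increments."""
    out, total = [], 0
    for s in sigma2:
        total += s
        out.append(total)
    return out

# d_0 = X_0 = 0 has variance 0; each later step has variance 1.
angle = variance_process([0, 1, 1, 1, 1])
exact = [variance_of_walk(n) for n in range(5)]
```

Both lists equal 0, 1, 2, 3, 4: for this walk the variance process is the deterministic sequence $\langle X,X \rangle_n = n$.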

1.6.  Martingales: 

1.6.1. Definition: Let $(\Omega, H, (G_n), P)$ be a filtered probability space. A
G-martingale is a sequence $(M_n, n \in Z)$ of random variables on $\Omega$ with the following
properties:

(a) $M = (M_n(\omega))$ is adapted to G

(b) $E\{|M_n|\} < \infty$, for all $n \in Z$

(c) $E\{M_n \mid G_{n-1}\} = M_{n-1}$, a.s. P,
for all n in Z.

By definition of conditional expectation, condition (c) is equivalent to requiring
that for all $A \in G_{n-1}$

(c') $\int_A M_n \, dP = \int_A M_{n-1} \, dP.$

If the equality in (c) or (c') is replaced by $\le$, or $\ge$, then M is called a
G-supermartingale, or a G-submartingale, respectively. When the underlying
filtration,  G,  remains  fixed  in  a  particular  discussion  we  will  often  drop  the 
qualifier  G  and  just  write  “martingale”  or  “supermartingale”  or  “submartingale”. 

It  follows  from  the  definition  that  a  martingale  satisfies 

$$M_k = E(M_n \mid G_k),$$

for every pair, (k,n), of nonnegative integers with $k \le n$, not just neighboring
integers. Similar statements hold for supermartingales and submartingales. To
see this in the case of supermartingales just use the fact that filtrations are
increasing and conditional expectations are smoothing operators and proceed as
follows: $M_n \ge E(M_{n+1} \mid G_n)$, so that if $k \le n$

$$E(M_n \mid G_k) \ge E(E(M_{n+1} \mid G_n) \mid G_k) = E(M_{n+1} \mid G_k).$$

Thus, $(E(M_n \mid G_k), n \ge k)$ is a decreasing sequence. Hence, supermartingales
decrease in conditional mean and so

$$M_k = E(M_k \mid G_k) \ge E(M_n \mid G_k).$$

Similarly, submartingales increase in conditional mean and martingales are
constant in conditional mean, with the same obviously being true in the case of the
unconditional means.
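The multi-step identity $M_k = E(M_n \mid G_k)$ can be checked exhaustively in a small case. The following is only a sketch under stated assumptions: the martingale is taken to be the fair +/-1 random walk started at 0, $G_k$ is the history of the first k steps, and the names conditional_mean, walk_value and check_martingale are illustrative, not part of the text.

```python
from fractions import Fraction
from itertools import product


def conditional_mean(f, prefix, horizon):
    """E( f(path up to time `horizon`) | first len(prefix) steps = prefix ).

    Computed exactly by averaging over all equally likely continuations
    of a fair +/-1 walk.
    """
    tails = list(product((-1, 1), repeat=horizon - len(prefix)))
    return Fraction(sum(f(prefix + tail) for tail in tails), len(tails))


def walk_value(steps):
    """Position M_n of the walk after the given steps (M_0 = 0)."""
    return sum(steps)


def check_martingale(horizon):
    """Check M_k = E(M_n | G_k) for n = horizon, every k <= n and every history."""
    return all(
        conditional_mean(walk_value, prefix, horizon) == walk_value(prefix)
        for k in range(horizon + 1)
        for prefix in product((-1, 1), repeat=k)
    )
```

Replacing walk_value by a non-martingale functional, such as the cube of the position, makes the same check fail, which is a quick way to see that the identity is not automatic.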

1.6.2.  Remark:  There  are  some  immediate  results  about  martingales  that  are 
simple  to  verify  and  are  used  constantly.  As  usual,  a  single  underlying  filtration 
is  assumed  in  each  statement. 

o  If  M  and  N  are  martingales,  then  M  +  N  is  a  martingale. 

o If $\phi$ is a real valued convex function defined on R, M is a
martingale and $\phi(M_k)$ has finite expectation, then $(\phi(M_k))$ is
a submartingale.

o If M is a martingale which is square integrable relative to
P, then $M^2 - [M,M]$ is a martingale. Also, $M^2 - <M,M>$ is a
martingale.

All but the second statement follow by straightforward computation using the
definition of the quantities involved.

The second statement requires Jensen's inequality. This is based on a result
about real valued convex functions which states that there exist affine maps,
$\phi_n(x) = a_n x + b_n$, such that $\phi = \sup_n \phi_n$. Using the monotonicity
and linearity of the conditional expectation operators, we obtain

$E(\phi(X) \mid G) \ge E(\phi_n(X) \mid G) = \phi_n(E(X \mid G)).$

Taking the supremum over n on the right side, Jensen's inequality follows:
$E(\phi(X) \mid G) \ge \phi(E(X \mid G))$. By replacing X and G by $M_k$ and
$G_{k-1}$, and using the fact that M is a G-martingale, we obtain
the result. Our applications include the important case $\phi(x) = x^2$.
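For the case $\phi(x) = x^2$ the two claims about $M^2$ can be verified by enumeration. This is a minimal sketch, assuming M is the fair +/-1 walk started at 0, so that $[M,M]_n$ is the sum of the squared steps; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def one_step_mean(f, prefix):
    """E( f(history after one more fair +/-1 step) | history = prefix )."""
    return Fraction(f(prefix + (-1,)) + f(prefix + (1,)), 2)


def m_squared(steps):
    """phi(M_n) with phi(x) = x^2."""
    return sum(steps) ** 2


def m_squared_minus_qv(steps):
    """M_n^2 - [M,M]_n, where [M,M]_n is the sum of squared increments."""
    return sum(steps) ** 2 - sum(s * s for s in steps)


def check(horizon):
    """Return (M^2 is a submartingale, M^2 - [M,M] is a martingale) up to `horizon`."""
    sub_ok, mart_ok = True, True
    for n in range(horizon):
        for prefix in product((-1, 1), repeat=n):
            sub_ok &= one_step_mean(m_squared, prefix) >= m_squared(prefix)
            mart_ok &= (one_step_mean(m_squared_minus_qv, prefix)
                        == m_squared_minus_qv(prefix))
    return sub_ok, mart_ok
```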

We have encountered some basic martingales earlier. Let $X = (X_n)$ be written as
in equation (7), and give the sequence of differences $(d_k)$ the assumptions in the
paragraph following (7). Then X is a martingale, since

$E\{X_n \mid G_{n-1}\} = E\{d_n \mid G_{n-1}\} + E\{X_{n-1} \mid G_{n-1}\}$ (8)

and

$E\{d_n \mid G_{n-1}\} = 0, \quad E\{X_{n-1} \mid G_{n-1}\} = X_{n-1}$, a.s.P. (9)

The first of these equations is due to the fact that we took the difference sequence
to be independent of the past and have zero expectation (i.e., the difference
sequence is centered at conditional expectations). The second is a result of the
fact that X is adapted to G: then $X_{n-1}$ is $G_{n-1}$-measurable, and it is a
property of conditional expectations that $E\{f \mid K\} = f E\{1 \mid K\} = f$, a.s.P,
whenever f is K-measurable. Putting equations (8) and (9) together verifies the
claim that X is a martingale.

Finally, it should be clear that we didn't need the finite variance assumption on
the difference sequence; this was only assumed in the original example because we
wanted to give an example about the variance process. In fact from equation (8)
it follows that if the $d_k$ have finite expectations and are centered at expectations
conditioned on $G_{k-1}$, then X is a G-martingale.

1.7. Doob's Theorems: Another example of a martingale does assume that each
$X_n$ has finite variance, but that is all. Then

[X,X] - <X,X> (10)

is  a  martingale  relative  to  the  filtration  G.  This  follows  directly  from  the  explicit 
form  for  the  quadratic  variation  and  the  variance  process  for  exactly  the  reasons 
that  our  first  example  was  a  martingale.  Although  this  is  true  in  the  continuous 
case  also,  it  will  follow  from  the  Doob-Meyer  decomposition  and  will  constitute 
the  definition  of  the  process  <X,X>. 


It will be beyond the scope of this note to even outline the proof of the
continuous time Doob-Meyer decomposition. Therefore, we will give a proof of the
decomposition theorem in the discrete case. This has the added advantage that it
is simple to prove and its proof, along with the statement of the result, will allow
us to introduce a number of concepts which are quite difficult in the continuous
time analogue.

1.7.1. Lemma: (Uniqueness of the Doob-Meyer Decomposition)

If a process $X = (X_n, n \in Z)$ can be written in the form X = M + A, where
$M = (M_n)$ is a G-martingale and $A = (A_n)$ is a G-previsible process, with
$A_0 = 0$, then the representation is unique (up to indistinguishability).

Proof: Suppose that two representations exist: X = M + A = m + a, where m
and a have the same properties as M and A. Then M - m = a - A demands that
M - m is a previsible martingale. This implies that
$M_n - m_n = E\{M_n - m_n \mid G_{n-1}\} = M_{n-1} - m_{n-1}$. Hence,
$M_n - m_n = M_0 - m_0$. Finally, since the last quantity is
equal to $a_0 - A_0 = 0$, $M_n = m_n$, a.s.P; hence, M - m is evanescent. This of
course implies the same for A - a.

Notice  that  we  have  also  proved  the  interesting  and  useful  fact  that  previsible 
martingales  are  constant  a.s.P.  A  similar  statement  is  true  in  continuous 
time  (Chapter  4),  but  requires  an  enormous  amount  of  machinery  to  prove. 

1.7.2. Theorem: (Doob Decomposition)

Let $X = (X_n)$ be an $L^1(P)$, G-adapted stochastic process. Then there exist
processes M and A, where M is a martingale and A is previsible with $A_0 = 0$,
such that X = M + A. This representation is unique (modulo
indistinguishability).

Because of the previous Lemma the proof of this statement just consists in
observing that we can write

$X_n - X_{n-1} = X_n - E(X_n \mid G_{n-1}) + E(X_n - X_{n-1} \mid G_{n-1}).$ (11)

It follows that X = M + A, where

$M_n = X_0 + \sum_{k=1}^{n} (X_k - E(X_k \mid G_{k-1}))$ (12)

and

$A_n = \sum_{k=1}^{n} E(X_k - X_{k-1} \mid G_{k-1}), \quad A_0 = 0.$ (13)

Clearly,  M  is  a  martingale  and  A  is  previsible.  Of  course,  all  these  equations  hold 
with  probability  one  only. 
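Equations (12) and (13) translate directly into a small computation. The sketch below makes several assumptions that are not in the text: the randomness is a sequence of fair +/-1 steps, $G_n$ is the history of the first n steps, and a process is represented by a hypothetical function x(history) giving its value after that history. The example process is the squared walk, for which the construction should give $A_n = n$.

```python
from fractions import Fraction


def one_step_mean(x, past):
    """E( X_k | G_{k-1} ) on the history `past` of length k-1 (fair +/-1 steps)."""
    return Fraction(x(past + (-1,)) + x(past + (1,)), 2)


def doob_decomposition(x, path):
    """Return the lists (M_0..M_n) and (A_0..A_n) along one path.

    M follows equation (12) and A follows equation (13); A_k only uses
    the history up to time k-1, which is what previsible means here.
    """
    m_vals, a_vals = [Fraction(x(()))], [Fraction(0)]
    for k in range(1, len(path) + 1):
        past = path[:k - 1]
        cond = one_step_mean(x, past)                    # E(X_k | G_{k-1})
        a_vals.append(a_vals[-1] + cond - x(past))       # increment of (13)
        m_vals.append(m_vals[-1] + x(path[:k]) - cond)   # increment of (12)
    return m_vals, a_vals


def squared_walk(steps):
    """X_n = S_n^2 for the walk S with the given steps."""
    return sum(steps) ** 2
```

Along any path the two lists add up to the values of X, and for the squared walk the martingale part is $S_n^2 - n$.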

The process, A, of Doob's decomposition is called the “compensator” of the
process, X, according to the following. Let X be a P-integrable process. Then the
process, $\tilde{X}$, defined by setting $\Delta \tilde{X}_n = E(\Delta X_n \mid G_{n-1})$
for $n \ge 1$, $\tilde{X}_0 = 0$, is
called the compensator of the process, X. If in addition to P-integrability, X is
G-adapted, then $\tilde{X}$ is obviously characterized by the following three properties:

(a) $X - \tilde{X}$ is a G-martingale;

(b) $\tilde{X}$ is a G-previsible process;

(c) $\tilde{X}_0 = 0$.

Compensators  will  be  examined  in  some  detail  in  Chapter  4. 

The  following  corollary  is  immediate  and  is  of  the  form  stated  in  the  sequel, 
where  the  index  set  is  a  continuum: 

1.7.3.  Corollary:  (Doob-Meyer  Decomposition  Theorem) 

If X is a G-submartingale, then there exist processes M and A, where M is a
G-martingale and A is an increasing, G-previsible process with $A_0 = 0$, such that
X = M + A, uniquely (modulo indistinguishability).

The only part that now requires proof is the statement that A is an increasing
process. Since X is a submartingale, this follows immediately from the definition
of A in equation (13) written in the form
$A_n = A_{n-1} + E(X_n \mid G_{n-1}) - X_{n-1} \ge A_{n-1}$,
a.s.P. (When $A_0(\omega) = 0$ and $A_n(\omega) \ge A_{n-1}(\omega)$ for P-almost all
$\omega$ in $\Omega$ and $n \ge 1$, the process A is said to be an increasing process.)

1.7.4. Remark: Immediately following the definition of martingales we pointed
out that when M is an $L^2$ martingale (so that by Jensen's inequality, $M^2$ is a
submartingale), both $M^2 - [M,M]$ and $M^2 - <M,M>$ are martingales. Since
<M,M> is previsible, it follows from the uniqueness of the Doob-Meyer
Decomposition that $M^2 = m + <M,M>$, with m a martingale, is the decomposition
specified by the Corollary, the Doob-Meyer decomposition of $M^2$. Thus, <M,M> is a
previsible process which “compensates” for $M^2$ not being a martingale, even though
M is one. Indeed, the process <M,M> is the compensator of $M^2$. This is because

$E(\Delta(M^2)_n \mid G_{n-1}) = E((\Delta M_n)^2 \mid G_{n-1}) = \Delta <M,M>_n.$

1.7.5. Another way of visualizing the DMD Theorem is to recall that on the
average, submartingales rise. That is, $n \to E X_n$ is an increasing function on Z+.
The DMD Theorem says that A accounts for this proclivity to rise by previsibly
compensating X to produce a martingale, X - A, which of course has constant
expectation.

1.7.6. Remark: Processes of the form X = M + A, where M is a martingale and
A is an increasing process, are special cases of a class of processes called
semi-martingales in the sequel. When the decomposition is unique (to within
indistinguishability), then X is called a special semi-martingale. Hence, the
Doob-Meyer Theorem states that submartingales are a particular form of special
semi-martingale. This is a very convenient interpretation from the standpoint of
applications since a semi-martingale is just a mathematical model for a dynamical
system which consists of a “signal” or “trend” term, A, and a “noise” term, M.

It is easily seen that the Doob-Meyer Theorem also holds when X is a
supermartingale. We need only write X = M - A in order to maintain the property
that A is an increasing process. Again, on the average supermartingales fall and A
previsibly compensates to produce X + A, which has constant averages.

1.7.7. Remark: Since engineers have been using the “signal plus noise” model for
decades, it is probably worthwhile to take a moment to understand why they
have been so successful (and to acknowledge the generality of their achievement).
The DMD Theorem states that any discrete time process with finite mean that is
observable relative to some filtration (flow of information, Wong [1973]) is a
semi-martingale. In fact, if $X = (X_n, n \in Z)$ is any finite mean process and
$(F_n)$ is any information flow, then the sequence $(Y_n)$, where
$Y_n = E(X_n \mid F_n)$ (i.e., what is observable about X relative to available
information), can be shown to be a semi-martingale. It took mathematicians a while
to understand all this and then to do what their discipline demands, namely,
explain the reason why “signal plus noise” models were important, from a viewpoint
other than “the model seems to work”. This note in some sense shows the lengths to
which mathematicians have gone in the last 30 or so years to explain the full
significance of semi-martingales (in continuous time), including their construction
of a calculus to study these processes in their most general form and at the same
time to provide scientists with the correct tools to model stochastic dynamical
systems. Only time will tell whether or not the resulting mathematical theory is
technically too difficult for applications.

1.7.8.  We  will  now  return  to  the  initial  reason  for  this  chapter:  to  introduce  and 
study  transforms  of  martingales,  called  martingale  transforms,  in  an  effort  to 
set  an  intuitive  foundation  for  the  development  of  stochastic  integrals. 

1.7.9.  Theorem. 

Let X be a martingale (supermartingale, submartingale). If V is a nonnegative,
previsible process and the transform of X by V is P-integrable, then V.X is a
martingale (supermartingale, submartingale).

The  proof  of  this  very  important  result  is  an  immediate  consequence  of  the 
second  representation  of  a  transform  in  equation  (2): 

$E\{(V.X)_n - (V.X)_{n-1} \mid G_{n-1}\} = E\{V_n (X_n - X_{n-1}) \mid G_{n-1}\}$

$= V_n E(X_n - X_{n-1} \mid G_{n-1}).$

The right side of this equation is = 0, $\le 0$ or $\ge 0$, depending on whether X
is a martingale, a supermartingale or a submartingale, respectively. The result
follows since $(V.X)_{n-1}$ is $G_{n-1}$-measurable.
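The theorem can be illustrated with a betting rule that looks only at the past. The sketch below is an illustration under assumptions of ours: X is the fair +/-1 walk started at 0, the previsible process V is a hypothetical "double the stake after each loss" rule named stake, and the transform is the running sum of stake times increment.

```python
from fractions import Fraction
from itertools import product


def transform(v, steps):
    """(V.X)_n for the walk X with increments `steps`.

    The stake applied to step number k+1 is v(steps[:k]); it sees only
    the first k steps, so V is previsible.
    """
    return sum(v(steps[:k]) * steps[k] for k in range(len(steps)))


def stake(history):
    """Hypothetical nonnegative previsible rule: double the stake after each loss."""
    losses = 0
    for s in reversed(history):
        if s == 1:
            break
        losses += 1
    return 2 ** losses


def is_martingale_transform(v, horizon):
    """Check E( (V.X)_{n+1} | G_n ) = (V.X)_n for all n < horizon and all histories."""
    for n in range(horizon):
        for prefix in product((-1, 1), repeat=n):
            mean_next = Fraction(
                transform(v, prefix + (-1,)) + transform(v, prefix + (1,)), 2)
            if mean_next != transform(v, prefix):
                return False
    return True
```

However aggressive the previsible stake, the conditional mean of the next value of V.X equals its current value, which is the content of the displayed computation.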

1.7.10.  Corollary. 

If T is a G-stopping time, and X is a martingale (supermartingale, submartingale),
then the stopped process, $X^T$, is a martingale (supermartingale, submartingale).

It is easy to see that $X^T = V.X$, when $V_n = 1_{[n \le T]}$. To show that V is
G-previsible just write

$[n \le T] = ( \bigcup_{k=0}^{n-1} [T = k] )^c \in G_{n-1}.$

Thus, the indicator function of $[n \le T]$ is $G_{n-1}$-measurable, so that V is
G-previsible. It only remains to show that $X^T$ is P-integrable. Since
$T(\omega) \wedge n \le n$, this is a consequence of

$|X^T_n(\omega)| \le \sum_{k=0}^{n} |X_k(\omega)|.$

1.7.11. Remark: Subsets B of $Z_+ \times \Omega$ are called random sets. In the
sequel such sets will be called previsible random sets if their indicator processes,
$(n,\omega) \to 1_B(n,\omega)$, are previsible processes. Anticipating a concept that
will be introduced in Chapter 2, we point out that if T is a stopping time, then
random sets of the form $\{ (n,\omega) : n \le T(\omega), (n,\omega) \in Z_+ \times \Omega \}$
are previsible random sets.
This random set is a particular example of a stochastic interval, denoted
[[0,T]]. In this instance, we would write $V = 1_{[[0,T]]}$ as a process defined on
$Z_+ \times \Omega$. Notice that in the proof of the last Corollary we wrote
$V_n = 1_{[n \le T]}$,
thereby defining the process V by means of a sequence of random variables on
$\Omega$. These two ways of defining the same process lead to nothing new in discrete
time, but once we enter the continuous time domain we will find that studying
processes as families of random variables will not be adequate. It will turn out
that stochastic intervals will provide an intuitive way of studying the
measurability of such processes as functions of two variables.

1.7.12.  Remark:  It  is  convenient  at  this  point  to  add  the  following  Corollary. 
This  form  of  Doob’s  Optional  Sampling  (Stopping)  Theorem  (1953)  is  not  stated 
in  its  most  general  form,  but  it  is  sufficient  for  our  purposes.  The  boundedness 
condition  imposed  on  the  stopping  times  can  be  relaxed;  such  a  form  of  Doob's 
theorem  (in  the  continuous  parameter  case)  will  be  stated  in  Chapter  2.  Page  67 
in  Neveu  [1975]  contains  the  discrete  parameter  version. 

1.7.13. Theorem (Doob's Optional Sampling Theorem).

If X is a martingale, and S, T are bounded stopping times with $S \le T$, then $X_S$
and $X_T$ are P-integrable and

$E\{X_T \mid G_S\} = X_S$, (a.s.P). (14)

(T is a bounded stopping time if there exists a constant, K, such that
$T(\omega) \le K$ for all $\omega$ in $\Omega$.)

1.7.14. For the proof, just set $V_n = 1_{[S < n \le T]}$; then
$V_n = 1_{[n \le T]} - 1_{[n \le S]}$.
Using the obvious linearity of transforms, and realizing as in the proof of the
previous Corollary that V.X is P-integrable, this Corollary states that Y := V.X is a
martingale, which satisfies $Y_n = X_{n \wedge T} - X_{n \wedge S}$ and, in this case,
satisfies $Y_0 = 0$.
Because of the boundedness condition, we can choose a positive integer m such
that $m \ge \max(S,T)$ on $\Omega$. Then $Y_m = X_T - X_S$. Therefore,
$0 = E Y_0 = E Y_m = E(X_T - X_S)$. It is a simple exercise to show that
$E(X_T) = E(X_S)$ is equivalent to
equation (14). Let A be any element in $G_S$. Define the stopping times S' and T'
by setting $S' = S 1_A + m 1_{A^c}$ and $T' = T 1_A + m 1_{A^c}$. Then $S' \le T'$,
and so we again have $0 = E(X_{T'} - X_{S'})$, which by definition of S' and T' can be
written in the following form: $0 = E(1_A (X_T - X_S))$. Referring to the definition
of conditional expectation, this last equation is exactly the statement in
equation (14).


1.7.15. Remark: Recall the random walk example given at the beginning of this
Chapter. The symmetric random walk, $X_n$, is a martingale and so
$E X_0 = 0 = E X_n$. If we define the stopping times $T := \min\{n : X_n = 1\}$ and
S = 0, then $S \le T$, but since $P(X_T = 1) = 1$, we have that

$E X_T = 1 \ne E X_S = 0.$

The problem is that T is not a bounded stopping time. Of course,
$P(T < \infty) = 1$ since the random walk is recurrent.
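The role of boundedness in this example can be seen numerically by truncating T at a fixed time m, which gives the bounded stopping time min(T, m). The sketch below is an illustration only, with names of our own choosing; it enumerates all paths of a fair +/-1 walk of length m and returns the exact mean of the stopped walk together with the probability that level 1 has been reached by time m.

```python
from fractions import Fraction
from itertools import product


def stopped_value(steps):
    """Value of the walk stopped at min(T, m), with T the first time it hits +1
    and m = len(steps)."""
    pos = 0
    for s in steps:
        pos += s
        if pos == 1:
            break
    return pos


def summary(m):
    """Return (E X_{min(T, m)}, P(T <= m)), computed exactly over all 2^m paths."""
    values = [stopped_value(p) for p in product((-1, 1), repeat=m)]
    mean = Fraction(sum(values), len(values))
    hit = Fraction(sum(v == 1 for v in values), len(values))
    return mean, hit
```

For every m the mean is exactly 0, as the Optional Sampling Theorem requires for bounded stopping times, even though the probability of having reached 1 increases with m. The rare paths that have not yet reached 1 sit far below 0 and carry the balance.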


1.7.16. Recall the decomposition given in the first remark following the
Doob-Meyer Decomposition and apply the Optional Sampling Theorem to the
martingale M. Then

$E(X_T - X_S \mid G_S) = E(A_T - A_S \mid G_S),$

where S and T are bounded stopping times with $S \le T$. Aldous [1981] then gives
the following partial converse to the Doob-Meyer Decomposition Theorem:


1.7.17.  Corollary: 

Let X be a submartingale with $X_0 = 0$ and A a previsible process. If
$E X_T = E A_T$, for all bounded stopping times T, then A is the compensator of X.


For the proof, just set M = X - A. Then $E M_0 = 0$, so that $E M_T = 0$, for each
bounded stopping time T. As in the proof of the Optional Sampling Theorem,
$0 = E(M_T - M_S) = E(1_F (M_T - M_S))$, for all F in $G_S$. Hence, M is a martingale
and, therefore, A is the compensator of X.


1.7.18. Remark: If X is a supermartingale, then the theorem continues to hold
with the equality in equation (14) replaced by “$\le$”. Similarly, if X is a
submartingale, then the equality is replaced by “$\ge$”. To appreciate the importance
of this result, it should be noted that Abraham Wald's theory of sequential testing
is based on this theorem.


1.7.19. There is an extremely important collection of results on the convergence
of martingale (super and submartingale) sequences together with the fundamental
inequalities of Doob (the Maximal Inequality) and others, that could be
mentioned at this point. The interested reader should consult Neveu [1975]. Some of
these results will be mentioned in Chapter 2 and used in the sequel.

1.7.20.  We  now  return  to  the  martingale  transform  proper,  and  the  quadratic 
variation  and  variance  processes. 

1.8. Calculus of Martingale Transforms: One of the simplest and most
useful relationships involving transforms is integration by parts. Let $X = (X_n)$
and $V = (V_n)$ be processes on $(\Omega, H, P)$. Define the process $X_-$ from X by
setting $(X_-)_k := X_{k-1}$. Then the integration by parts formula is

$(V.X)_n(\omega) + (X_-.V)_n(\omega) = V_n(\omega) X_n(\omega).$ (15)

The proof of (15) follows immediately from the definition of a transform by
writing down the formulae for the left side of equation (15) and verifying that the
result is a telescoping sum that reduces to the product on the right side of (15).

Observing that transforms are bilinear and writing $V_k = V_{k-1} + \Delta V_k$, we see
that $V.X = (V_-).X + \Delta V.X$. Therefore, we can write the integration by parts
formula in the more symmetric form

$X_n V_n = (V_-.X)_n + (X_-.V)_n + \sum_{k=0}^{n} \Delta V_k \Delta X_k.$

We have already encountered the last term in this equation, namely the cross
covariation process corresponding to V and X. Thus, for future reference we
state the following

1.8.1. Theorem (Integration by Parts):

$X_n V_n = (V_-.X)_n + (X_-.V)_n + [V,X]_n.$

This form of integration by parts would coincide exactly with the familiar
Riemann-Stieltjes or Lebesgue-Stieltjes form, if we had defined [X,X] as a
summation from 1 to n instead of 0 to n. Then we would have the usual
$X_n V_n - X_0 V_0$ on the left side of the last equation. We will return to this topic
in Section 3.2.
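Both forms of the integration by parts formula are pathwise identities, so they can be checked on arbitrary number sequences. The sketch below assumes the convention suggested by the sums running from 0 to n, namely that a process is taken to be 0 at time -1, so that the increment at time 0 is the initial value; the helper names are illustrative.

```python
def increments(x):
    """Delta x_k = x_k - x_{k-1}, with the convention x_{-1} = 0."""
    return [x[0]] + [x[k] - x[k - 1] for k in range(1, len(x))]


def shifted(x):
    """The process x_-, defined by (x_-)_k = x_{k-1}, with x_{-1} = 0."""
    return [0] + list(x[:-1])


def transform(v, x):
    """(V.X)_n = sum over k = 0..n of V_k * Delta X_k, at the last index n."""
    return sum(vk * dk for vk, dk in zip(v, increments(x)))


def covariation(v, x):
    """[V,X]_n = sum over k = 0..n of Delta V_k * Delta X_k."""
    return sum(dv * dx for dv, dx in zip(increments(v), increments(x)))


def parts_identities_hold(v, x):
    """Check equation (15) and the symmetric form of Theorem 1.8.1 on one path."""
    product_at_n = v[-1] * x[-1]
    first = transform(v, x) + transform(shifted(x), v)
    symmetric = (transform(shifted(v), x) + transform(shifted(x), v)
                 + covariation(v, x))
    return first == product_at_n and symmetric == product_at_n
```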

1.8.2.  Examples: 

(1) $X_n = \sum_{k=0}^{n} d_k$, where the r.v.'s $d_k$ are arbitrary. Then, using
integration by parts,

$X_n^2 = 2 (X_-.X)_n + [X,X]_n.$

By substitution into this equation we obtain the following well-known formula
from linear algebra:

$( \sum_{k=0}^{n} d_k )^2 = 2 \sum_{0 \le j < k \le n} d_j d_k + \sum_{k=0}^{n} d_k^2,$

a classical formula, but obtained here as the sum of the discrete stochastic
integral of X relative to itself and the quadratic variation of X! When we
complete Chapter 6 and have a stochastic integral for continuous parameter
processes, we will realize that this formula in X continues to hold in exactly the
same form. In particular, when X is the Brownian motion process, we will see
that [X,X](t) = t. So the formula will read

$X^2(t) = 2 \int_0^t X(s) \, dX(s) + t$

and  Ito's  stochastic  integral  will  not  follow  the  “usual”  rules  of  calculus.  Kiyosi 
Ito,  the  creator  of  the  stochastic  integral  relative  to  the  Brownian  motion  process 
B, a martingale, designed this integral to have the property that the process
$t \to \int_0^t g \, dB = (g.B)(t)$ is a martingale, for a useful class of processes g.
This had the consequence that a number of the rules of ordinary calculus do not
carry over to the Ito integral. The generalization of Ito's stochastic integral to
one with a martingale integrator (Chapter 6) retains these characteristics.

A Russian mathematician, R. Stratonovich, modified Ito's definition slightly and
produced a stochastic integral which followed the usual rules but necessarily lost
the martingale property. This makes the above discrete application all the more
interesting: the “Ito” integral may have its most natural setting in the discrete
case.


It is an amusing exercise to define a discrete analogue to the Stratonovich
stochastic integral: Let X and V be arbitrary discrete parameter processes and set

$(V:X)_n := \sum_{k=0}^{n} ( \frac{V_k + V_{k-1}}{2} ) \Delta X_k.$

Then one can immediately find the following relationship between the Ito and
Stratonovich transforms:

$(V:X)_n = (V.X)_n - \frac{1}{2} [V,X]_n.$

The same relationship continues to hold between the Stratonovich and Ito
stochastic integrals in the case of continuous parameter processes.

In  Chapter  6,  after  we  have  formally  introduced  the  Brownian  motion  process 
and  stochastic  differential  equations,  we  will  see  that  another  correction  factor 
arises  when  one  attempts  to  approximate  an  Ito  stochastic  differential  equation 
by  replacing  the  Brownian  motion  term  with  a  member  of  a  sequence  of  smooth 
processes.  If  this  sequence  converges  to  Brownian  motion,  then  (under  certain 
conditions)  the  corresponding  sequence  of  differential  equations  converges  in  the 
mean  to  a  process  which  satisfies  a  stochastic  differential  equation  which  differs 
from  the  original  one  by  a  term  called  the  Wong-Zakai  factor  (see  E.  Wong  and 
M.  Zakai  [1965]). 

We conclude this example by illustrating that the (discrete) Stratonovich integral
obeys the classical rules of calculus in a simple special case. Set V = X in the last
equation and substitute $X_k = \Delta X_k + X_{k-1}$ into the integrand of our transform
on the right of this equation to obtain

$2 (X:X)_n = 2 (X_-.X)_n + [X,X]_n = X_n^2.$

The equality on the right is due to the integration by parts formula derived
earlier. So, as in ordinary calculus, the discrete “Stratonovich integral” of X with
respect to X is just X-squared over 2.
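The relationship between the two discrete transforms, and the rule that the Stratonovich transform of X against itself is half of X squared, can be checked on any pair of sequences. The sketch below is an illustration under the assumption that processes are taken to be 0 at time -1; exact rational arithmetic is used for the averaged integrand, and the function names are ours.

```python
from fractions import Fraction


def increments(x):
    """Delta x_k = x_k - x_{k-1}, with the convention x_{-1} = 0."""
    return [x[0]] + [x[k] - x[k - 1] for k in range(1, len(x))]


def ito_transform(v, x):
    """(V.X)_n = sum of V_k * Delta X_k for k = 0..n."""
    return sum(vk * dk for vk, dk in zip(v, increments(x)))


def stratonovich_transform(v, x):
    """(V:X)_n = sum of ((V_k + V_{k-1}) / 2) * Delta X_k for k = 0..n."""
    v_prev = [0] + list(v[:-1])
    return sum(Fraction(vk + vp, 2) * dk
               for vk, vp, dk in zip(v, v_prev, increments(x)))


def covariation(v, x):
    """[V,X]_n = sum of Delta V_k * Delta X_k for k = 0..n."""
    return sum(dv * dx for dv, dx in zip(increments(v), increments(x)))
```

The averaged integrand is what removes the correction term: replacing $V_k$ by the midpoint of $V_{k-1}$ and $V_k$ subtracts exactly half of each product of increments.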

(2) The following process, N, is called a discrete point process and will be the
subject of the end of this Chapter. Let $N_n := \sum_{k=0}^{n} d_k$, where the $d_k$ are
random variables with values in {0,1}, Bernoulli r.v.'s. Let $(F_n)$ be a filtration
and $\lambda_k = E(d_k \mid F_{k-1})$. A moment's reflection will lead one to conclude
that

$[N,N]_n = N_n.$

So, using the last theorem, we have the interesting, nonclassical formula

$N^2 = 2 (N_-.N) + N.$

Notice that this also gives us an example of a simple process whose variance
process is not deterministic:

$<N,N>_n = \sum_{k=0}^{n} \lambda_k.$

1.9. Properties of Martingale Transforms: We now collect some additional
transform properties which will play an important role in the chapter on
stochastic integration.

1.9.1.  Theorem: 

Let  T  be  a  stopping  time  and  H,  V,  Y  and  X  stochastic  processes  defined  on  the 
same  filtered  probability  space.  Then 


(a) $\Delta(V.X) = V \Delta X$;

(b) $[V.X, H.Y] = VH.[X,Y]$;

(c) $H.(V.X) = (HV).X$;

(d) X previsible $\Rightarrow$ V.X previsible;

(e) $(V.X)^T = (V.X^T) = (V^T.X^T)$;

(f) $[V,X]^T = [V^T,X^T] = [V,X^T]$;

(g) $(\tilde{X})^T = \widetilde{(X^T)}$.

1.9.2. Remarks: These statements are important for later developments of the
stochastic integral and its attendant calculus. In the discrete case the ease with
which they can be proved belies their importance. But before demonstrating this
fact we will say a few words about their meaning. If we interpret the first
statement in continuous time, anticipating Chapter 6, with
$\Delta Y_t = Y_t - Y_{t-}$, where $Y_{t-} := \lim_{s \to t-} Y_s$, and

$(V.X)_t = \int_0^t V_s \, dX_s,$

then $\Delta(V.X)_t = V_t \Delta X_t$ means that “jump” points of the integral are due
entirely to the jump points of the integrator, not the integrand. The integrand
only affects the magnitude of the jump. The same statements apply to the
discrete parameter processes being considered in this chapter if we say that a
transform has a “jump” at time n iff $\Delta(V.X)_n \ne 0$. The interesting thing to
note, here and as we pass through the various types of processes on our way to
the general stochastic integral, is that these and many other properties of
transforms continue to hold at each step. This is very important, because after
the Lebesgue-Stieltjes stochastic integral the definitions of “integral” may at first
bear little resemblance to the traditional notions of such things.

As to the proofs of these statements in the context of this chapter, the first
amounts to noting that $\Delta(V.X)_n$ is just the nth term of the sum, V.X.

Part (b) of the theorem follows immediately from (a). For simplicity take V = H
and X = Y. The general term of [V.X, V.X] is $(\Delta(V.X)_n)^2$ and this equals
$(V_n \Delta X_n)^2 = V_n^2 (\Delta X_n)^2$. Then (b) follows by observing that this is
the general term of $V^2.[X,X]$.

Parts (c) and (d) are immediate consequences of the definition of a transform. In
particular, Part (d) has the corollary that if T is a stopping time and X is
previsible, then $X^T$ is previsible.

Now Part (c) can be used to prove Part (e). For instance, to verify this claim,
take $H_n = 1_{[T \ge n]}$. Using Part (c), we obtain

$(V.X)^T = H.(V.X) = (HV).X = (VH).X = V.(H.X) = V.X^T.$

The rest of (e) is proved in a similar manner. Part (e) provides a mechanism by
which the stochastic integrals introduced in Chapter 6 are extended to larger
classes of integrators by localization and “pasting”. It says that the transform of
X by V stopped at T is the transform of X, stopped at T, by V.


The proof of (f) follows from similar observations. Again set
$H_n = 1_{[T \ge n]}$. Then

$\Delta[V^T,X^T] = \Delta[H.V,H.X] = (\Delta(H.V)) (\Delta(H.X)) = H^2 \Delta V \Delta X = H \Delta V \Delta X = \Delta[V,X]^T,$

since by its definition $H^2 = H$.

Finally, the proof of Part (g) uses the characterization of compensators given in
Section 1.7.2, Part (d) and the fact that a stopped martingale is also a
martingale.
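Parts (e) and (f) are pathwise statements, so along a single path with a given value t of the stopping time they reduce to identities between finite sequences. The sketch below checks them on such sequences; it assumes the convention that processes are 0 at time -1, and the helper names are illustrative only.

```python
def increments(x):
    """Delta x_k = x_k - x_{k-1}, with the convention x_{-1} = 0."""
    return [x[0]] + [x[k] - x[k - 1] for k in range(1, len(x))]


def stopped(x, t):
    """The stopped path X^T with T = t on this path: (X^T)_n = X_{min(n, t)}."""
    return [x[min(n, t)] for n in range(len(x))]


def transform_path(v, x):
    """The whole sequence n -> (V.X)_n = sum over k <= n of V_k * Delta X_k."""
    out, acc = [], 0
    for vk, dk in zip(v, increments(x)):
        acc += vk * dk
        out.append(acc)
    return out


def covariation_path(v, x):
    """The whole sequence n -> [V,X]_n = sum over k <= n of Delta V_k * Delta X_k."""
    out, acc = [], 0
    for dv, dx in zip(increments(v), increments(x)):
        acc += dv * dx
        out.append(acc)
    return out
```

Stopping the transform, stopping only the integrator, or stopping both give the same sequence, which is the discrete form of the localization mechanism described above.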


1.10. Discrete Parameter Point Processes: We now introduce a discrete
parameter stochastic point process theory which parallels the continuous
parameter point process work done, primarily by Bremaud, from 1972 to the present.
The latter material considers mainly the case where the martingale compensator
of the continuous parameter point process is absolutely continuous relative to
Lebesgue measure; most applied works involving martingale techniques treat this
case. The necessary assumptions for the discrete parameter analogues of these
results and the exact form of their conclusions can sometimes be deduced directly
from this continuous parameter case and sometimes they cannot. In either case,
discovering the correct form and supplying a direct proof in the discrete
parameter case is usually quite simple (mathematically) and informative. As far as I
can determine, however, such an approach does not appear explicitly in the
literature. The basic mathematical foundation for the discrete case resides in a
more general part of the theory (random measures) than point processes with
absolutely continuous compensators and presents an unreasonable technical and
intuitive hurdle for most applied probabilists, mathematicians and statisticians.

The only paper I am aware of that suggests the importance of working directly
with discrete parameter point processes is by T. C. Brown [1983]. Brown's
objective is to approximate continuous parameter point processes by the discrete
case. One of his results says, roughly, that a large class of continuous parameter
point processes can be approximated arbitrarily closely over intervals of random
length by a discrete point process. In a later BRL report, it is our intent to use
some of Brown's results together with the discrete point process calculus suggested
here and the limit theory developed in Aldous [1981] to approximate stochastic
network models.

1.10.1. Definition: An F-adapted process, $X = (X_n, (F_n))$, where
$X_n : \Omega \to \{0,1\}$ and $X_0 = 0$, is called an F-Discrete Point Process (DPP).
$\lambda_n = E(X_n \mid F_{n-1})$ is called the F-intensity of the DPP.

1.10.2. Remarks: (1) Define $T_0 = 0$, and for $k \ge 1$, $k \in Z_+$, set

$T_k := \inf\{ n \in Z_+ : X_n = 1, n > T_{k-1} \},$
if $\{ \cdots \} \ne \emptyset$, and $+\infty$ otherwise.

Define $N_n := \sum_{k=0}^{n} X_k$. Then $N_n = \sum_{k \ge 1} 1_{[T_k \le n]}$. It is
immediate that $(N_n)$ and $(T_n)$ are equivalent representations of a DPP, $(X_n)$.
Note that $(T_k)$ is a sequence of F-stopping times, since
$[T_k \le n] = [N_n \ge k] \in F_n$ for all $k \ge 1$.

(2) Set $A_n = \sum_{k=0}^{n} \lambda_k$. Then $M = N - A$ is an F-martingale. The F-predictable
process, A, is the martingale compensator of N. The concept of martingale
compensator has been introduced earlier in Section 1.7.2.
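
The three equivalent descriptions in Remark 1.10.2 (indicators $X_n$, jump times $T_k$,
counting process $N_n$) and the compensated process $N - A$ can be sketched in a few
lines of Python. This is only an illustration: the function names and the
convention that the intensity is supplied as a function of the past indicators are
assumptions made for the example, not part of the report.

```python
import random
from itertools import accumulate


def simulate_dpp(intensity, n_steps, rng):
    """Simulate X_0..X_n where P(X_n = 1 | past) = intensity(past indicators)."""
    xs = [0]        # X_0 = 0 by Definition 1.10.1
    lams = [0.0]    # lambda_0 = 0, since X_0 = 0
    for _ in range(n_steps):
        lam = intensity(xs)          # predictable: uses X_0..X_{n-1} only
        lams.append(lam)
        xs.append(1 if rng.random() < lam else 0)
    return xs, lams


def jump_times(xs):
    """T_1 < T_2 < ... : the indices n at which X_n = 1."""
    return [n for n, x in enumerate(xs) if x == 1]


def counting_process(xs):
    """N_n = X_0 + ... + X_n."""
    return list(accumulate(xs))


def compensated_martingale(xs, lams):
    """M_n = N_n - A_n with A_n = lambda_0 + ... + lambda_n."""
    return [n_val - a_val
            for n_val, a_val in zip(accumulate(xs), accumulate(lams))]
```

For any path, `counting_process(xs)[n]` equals the number of jump times that are at
most n, which is the identity $N_n = \sum_{k \ge 1} 1_{[T_k \le n]}$ of Remark 1.10.2.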

The proofs of the following statements and additional results will appear in later
BRL Reports, Andersen [I, II, 1986].

Discrete parameter PP's are of interest here for at least three reasons: first, they
present an insight into the continuous parameter version of DPP; second, they
are applicable to time slotted, single channel communication networks (for example,
packet radio networks); and third, as noted in the reference to T. C. Brown
above, they can be used to approximate continuous parameter point processes.

1.10.3. Theorem: (An Exponential Martingale of a Point Process)

Let $N = (N_n, F_n)$ be an F-adapted DPP with F-intensity $\lambda$, and define the
process, $Y = (Y_n)$, by setting

$$Y_n = \frac{\exp(a N_n)}{\prod_{k=0}^{n} \big( 1 + \lambda_k (e^a - 1) \big)}$$

for all real a and $n \in Z_+$. Then $(Y_n, F_n)$ is an F-martingale.
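
A quick numerical sanity check of the theorem is possible by brute force. The
sketch below reads $Y_n$ as the product of the factors $b_k = \exp(aX_k)/(1 + \lambda_k(e^a - 1))$
of Remark 1.10.5 and verifies only the constant-mean consequence $E\,Y_n = Y_0 = 1$,
by enumerating every binary path of length n. The particular history-dependent
intensity used here is a hypothetical choice made for the example.

```python
import itertools
import math


def intensity(past):
    """Hypothetical F-intensity: depends on the previous indicator only."""
    return 0.2 + 0.5 * past[-1]


def path_probability_and_y(xs, a):
    """Probability of the path X_1..X_n = xs and the value of Y_n on it."""
    prob, y = 1.0, 1.0
    past = [0]                               # X_0 = 0
    for x in xs:
        lam = intensity(past)                # lambda_k = E(X_k | F_{k-1})
        prob *= lam if x == 1 else 1.0 - lam
        y *= math.exp(a * x) / (1.0 + lam * (math.exp(a) - 1.0))
        past.append(x)
    return prob, y


def expected_y(n, a):
    """E(Y_n), computed exactly by summing over all 2**n paths."""
    total = 0.0
    for xs in itertools.product((0, 1), repeat=n):
        prob, y = path_probability_and_y(xs, a)
        total += prob * y
    return total
```

Under these assumptions `expected_y(n, a)` should equal 1 up to rounding for every n
and every real a, as the martingale property requires.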

1.10.4. Remark: Assume $\lambda_k$ is $F_0$-measurable for all k. Then

$$E\big( e^{a(N_n - N_m)} \mid F_m \big) = \prod_{k=m+1}^{n} \big( 1 + \lambda_k (e^a - 1) \big), \qquad m < n. \qquad (17)$$

We will call a process satisfying (17) a Doubly Stochastic Bernoulli Process.

Notice that in this case $\lambda_k = E(X_k \mid F_0)$, for all k. E.g., if $F_0 = \sigma(A)$, where A is
some r.v., then $\lambda_k = g_k(A)$, where $g_k$ is a measurable function for each k (so that
$g_k(A)$ is $F_0$-measurable). As in the Poisson case (Bremaud [1981]), the intensity
will be said to be driven by A.

1.10.5. Remark: With Y as in the statement of the Theorem, Y satisfies

$$\Delta Y_n = Y_{n-1} \, \Delta B_n, \qquad (18)$$

where $B_n = \sum_{k=0}^{n} (b_k - 1)$ is an F-martingale and

$$b_k = \frac{\exp(a X_k)}{1 + \lambda_k (\exp(a) - 1)}.$$

Equation (18) is an analogue of the continuous parameter differential equation
$dY = Y\,dB$.

1.10.6. Remark: The proof of the Theorem is almost trivial once one writes
$Y_n = Y_{n-1} b_n$ and notices that $E(b_k \mid F_{k-1}) = 1$ for all k.
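
For completeness, the two observations can be spelled out as a short worked
derivation. It uses only the definition $\lambda_k = E(X_k \mid F_{k-1})$, the fact that $X_k$
takes the values 0 and 1, and the boundedness of $Y_n$ for fixed n (which makes
the conditional expectations well defined); the layout below is a sketch, not the
author's own proof.

```latex
\begin{align*}
E\big(e^{aX_k} \mid F_{k-1}\big)
  &= E\big(1 + X_k(e^a - 1) \mid F_{k-1}\big)
     && \text{since } X_k \in \{0,1\} \\
  &= 1 + \lambda_k (e^a - 1)
     && \text{since } \lambda_k = E(X_k \mid F_{k-1}), \\
E\big(b_k \mid F_{k-1}\big)
  &= \frac{E\big(e^{aX_k} \mid F_{k-1}\big)}{1 + \lambda_k(e^a - 1)} = 1
     && \text{since } \lambda_k \text{ is } F_{k-1}\text{-measurable}, \\
E\big(Y_n \mid F_{n-1}\big)
  &= Y_{n-1}\, E\big(b_n \mid F_{n-1}\big) = Y_{n-1}
     && \text{since } Y_{n-1} \text{ is } F_{n-1}\text{-measurable}.
\end{align*}
```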

1.10.7. Remark: Using the fact that $e^{aX} = X e^a + 1 - X$ when X takes only the
values 0 and 1, it is easy to check that

$$b_k - 1 = \frac{(e^a - 1)(X_k - \lambda_k)}{1 + \lambda_k (e^a - 1)} = g_k \, \Delta m_k,$$

where m is the compensated martingale, $m = N - A$, with $A_n = \sum_{k=1}^{n} \lambda_k$. It then
follows immediately from equation (18) that Y satisfies the following stochastic
“integral” equation:

$$Y_n = 1 + ((g Y_-) . m)_n,$$

where

$$g_k = \frac{e^a - 1}{1 + \lambda_k (e^a - 1)}.$$

This observation is a special case of a result due to Kabanov, Liptser, and
Shiryayev [1983] for continuous parameter processes. In this sense it is also a
special case of the results of C. Doleans-Dade [1970] and occurs in a similar form
in P. Bremaud [1981] for continuous parameter point processes with absolutely
continuous (relative to Lebesgue measure) compensators.
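
The equivalence between the product form of Y and the stochastic “integral”
equation can be checked path by path. The sketch below is illustrative only:
the function name is hypothetical, and the intensities are passed in as a plain
list of numbers in (0, 1) rather than being derived from a model.

```python
import math


def exponential_martingale_two_ways(xs, lams, a):
    """Return Y_n computed (i) as the product of the b_k and
    (ii) from Y_k = Y_{k-1} + g_k * Y_{k-1} * (X_k - lambda_k)."""
    c = math.exp(a) - 1.0
    y_product = 1.0
    y_recursive = 1.0
    for x, lam in zip(xs, lams):
        b = math.exp(a * x) / (1.0 + lam * c)     # b_k
        g = c / (1.0 + lam * c)                   # g_k
        y_product *= b
        # Delta m_k = X_k - lambda_k; the right side uses the old Y_{k-1}
        y_recursive += g * y_recursive * (x - lam)
    return y_product, y_recursive
```

Both returned values agree up to rounding because $1 + g_k (X_k - \lambda_k) = b_k$ whenever
$X_k$ is 0 or 1.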


1.11.  Introduction  to  Non-linear  Filtering  of  Discrete  Point  Processes. 

Earlier we showed (Doob's Decomposition) that an integrable (P), F-adapted
sequence possesses a unique representation as the sum of an F-martingale and a
predictable process. If $(F_n)$ is an observable history, and X is F-adapted, the time
evolution of X is observable. Therefore, Doob's result says that observable
processes with finite mean values all behave as semi-martingales. As noted
earlier, this is a very general and far reaching theoretical result. It becomes an
important result for applications when it is noticed that a semi-martingale that is
not adapted to an observable history can be projected onto a history of observation
in such a way that its image is a semi-martingale, with signal and martingale
parts adapted to the history of observation. This is the content of the Projection
Theorem below.

If the observed history is generated by a discrete point process, then the martingale
portion of this projected semi-martingale has an integral (transform)
representation in terms of the observed point process. This result combined with
the Projection Theorem leads directly to nonlinear filtering: the estimation of
functionals of an unobservable process in terms of their projections onto an
observable point process history.

1.12. Integral Representation, Projection and Innovation.

By a discrete point process martingale we mean a martingale which is
adapted to the internal history of a discrete point process. An integral representation
of such a martingale plays a crucial role in nonlinear filtering since it
guarantees the existence of the “innovations gain”, whose computation results in
the construction of “filters”.

The following theorem is proved in Bremaud [1981]; it is the only reference he
makes to discrete PP's. However, there is a huge literature regarding the
representation of continuous parameter point processes. We mention only Boel,
Varaiya and Wong [1975], Davis [1976], and Chou and Meyer [1975].

1.12.1. Theorem: Integral Representation of DPP Martingales

Let $N = (N_n, F_n)$ be a DPP with $F_n = \sigma\{ X_k, k \le n \}$ and F-intensity $\lambda$. Then,
if $m = (m_n, F_n)$ is an F-martingale, there exists an F-predictable process H, with
$E((|H| . \langle M, M \rangle)_n) < \infty$, for all $n \in Z_+$, such that $m = H . M$, where $M = N - A$ and
$A_n = \sum_{k=0}^{n} \lambda_k$.

Because nonlinear filtering has its origins in engineering, we will follow the customary
terminology of that field and refer to the value of a process at any time n
as the “state” of the dynamical system represented by that process at time n.
We have the following

1.12.2. Theorem: (Projection of State).

Let Z be a semi-martingale adapted to a filtration F:

$$Z_n = Z_0 + \sum_{k=0}^{n} f_k + m_n,$$

where $E|Z_0| < \infty$ and

(1) $m = (m_n)$ is a zero mean F-martingale,
(2) $f = (f_n)$ is an F-adapted process with finite mean,
(3) $O = (O_n)$ is a filtration with $O_n$ contained in
    $F_n$ for all n and $O_0 = \{\emptyset, \Omega\}$.

Then there exists a zero mean O-martingale, $\hat m$, such that

$$\hat Z_n = E( Z_n \mid O_n ) = E Z_0 + \sum_{k=0}^{n} \hat f_k + \hat m_n,$$

with $\hat f_k = E(f_k \mid O_{k-1})$.

1.12.3. Remark: In the continuous parameter case f must be taken to be
“progressively measurable” (Bremaud [1981]).

1.12.4. We now consider the important concept of innovations. Innovations
were introduced by Kailath for Brownian motion processes and by
Bremaud [1976, 1981] for the continuous parameter point processes. In our
discrete parameter case, the following simple description of “innovations” is
rigorous. This type of argument, not the concept itself, is only formal in continuous
time.

Using the notation of Theorem 1.12.2, suppose

(1) Let $O_n = \sigma(X_0, X_1, \ldots, X_n)$; then $O_n$ is contained in
    $F_n$, for all $n \ge 1$.

(2) Set $\hat\lambda_n = E\{\lambda_n \mid O_{n-1}\}$, where $\lambda_n$ is the $F_n$ intensity
    of $X_n$, and $\hat A_n = \sum_{k=0}^{n} \hat\lambda_k$. Then $\hat M = N - \hat A$ is a zero
    mean O-martingale.

(3) $\Delta \hat M_k = \Delta N_k - \Delta \hat A_k$
              $= \Delta N_k - E(\lambda_k \mid O_{k-1})$
              $= \Delta N_k - E(\Delta N_k \mid O_{k-1})$
              = Observed - Expected
              = Innovative Information.

    Therefore, the O-martingale, $\hat M$, is called the innovation
    process associated with the DPP N.

(4) Using the DPP representation, the state projection of
    Theorem 1.12.2 takes the form

    $$\hat Z_n = E Z_0 + \sum_{k=0}^{n} \hat f_k + (K . \hat M)_n .$$

    The O-previsible process, K, is called the innovations gain.

After the following statement of the filtering problem we will show how to explicitly
determine K.

1.13.  The  Non-Linear  Filtering  Problem  for  Discrete  Point  Processes: 

We can now summarize the state equations and their projections by the following
two systems of stochastic equations:

$$Z_n = Z_0 + \sum_{k=0}^{n} f_k + m_n ; \quad m \text{ is an F-martingale},$$

$$N_n = A_n + M_n ; \quad A_n = \sum_{k=0}^{n} \lambda_k ; \quad M \text{ is an F-martingale};$$

$$\hat Z_n = E Z_0 + \sum_{k=0}^{n} \hat f_k + \hat m_n ; \quad \hat m = K . \hat M \text{ is an O-martingale},$$

$$N_n = \hat A_n + \hat M_n ; \quad \hat M \text{ is an O-martingale}.$$

The Problem: On the basis of observations on $N_n$ construct a recursive estimator
for

$$\hat Z_n = E(Z_n \mid O_n) .$$

All that remains is to determine K from the fact that the filtering error is orthogonal
to the flow of information described by $(O_n)$:

$$E\{ ( Z_n - \hat Z_n)(H . \hat M)_n \} = 0, \qquad (19)$$

for all O-predictable processes, H.

1.13.1. An Application of Discrete Martingale Calculus: We will illustrate
the use of the martingale calculus given in the beginning of this chapter to determine
the innovations gain.

Set $\phi = H . \hat M$, $F_n = \sum_{k=0}^{n} f_k$, and $\hat F_n = \sum_{k=0}^{n} \hat f_k$. Assume that $(Z_n)$ is bounded.

Then, using integration by parts,

$Z_n \phi_n = (Z_- . \phi)_n + (\phi_- . Z)_n + [Z , \phi]_n$

$= ((H Z_-) . \hat M)_n + (\phi_- . (F + m))_n + [F + m , \phi]_n$

$= ((H Z_-) . M)_n + ((H Z_-) . (A - \hat A))_n + (\phi_- . F)_n + (\phi_- . m)_n +$

$\quad + (f . \phi)_n + (H . [m , \hat M])_n.$

Making a similar calculation for $\hat Z_n \phi_n$, obtain

$\hat Z_n \phi_n = (\hat Z_- . \phi)_n + (\phi_- . \hat Z)_n + [\hat Z , \phi]_n$

$= ((H \hat Z_-) . \hat M)_n + (\phi_- . (\hat F + \hat m))_n + [\hat F + \hat m , \phi]_n$

$= ((H \hat Z_-) . \hat M)_n + (\phi_- . \hat F)_n + (\phi_- . \hat m)_n +$

$\quad + (\hat f . \phi)_n + (KH . [\hat M , \hat M])_n.$

Now, taking expectations of both of these equations, using the fact that the
expectations of the martingale transforms vanish, and using equation (19), we
obtain

$$0 = E\{ ((H Z_-) . (A - \hat A))_n + (H(\lambda - \hat\lambda) . (m + F))_n \} - E\{ ((KH) . \langle \hat M, \hat M \rangle)_n \}.$$

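It follows that $K_\nu (1 - \hat\lambda_\nu) = \psi_{1,\nu} - \psi_{2,\nu} + \psi_{3,\nu} - \psi_{4,\nu}$, where the processes
$\psi_j$ (j = 1, 2, 3, 4) are O-predictable and satisfy

$$E \sum_\nu C_\nu Z_{\nu-1} \lambda_\nu = E \sum_\nu C_\nu \psi_{1,\nu} \hat\lambda_\nu,$$

$$E \sum_\nu C_\nu Z_{\nu-1} \hat\lambda_\nu = E \sum_\nu C_\nu \psi_{2,\nu} \hat\lambda_\nu,$$

$$E \sum_\nu C_\nu \lambda_\nu \Delta Z_\nu = E \sum_\nu C_\nu \psi_{3,\nu} \hat\lambda_\nu,$$

$$E \sum_\nu C_\nu \hat\lambda_\nu \Delta Z_\nu = E \sum_\nu C_\nu \psi_{4,\nu} \hat\lambda_\nu,$$

for all nonnegative, O-predictable processes, C, and $\Delta Z_\nu = f_\nu + \Delta m_\nu$.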

1.13.2. These calculations follow most of the work in this area (Bremaud [1976,
1981], Davis [1978], and others; also see Yor [1977] and Van Schuppen [1977]).
The formula for the gain given here, however, is slightly different from those of
the listed sources because in the analogous continuous parameter, absolutely continuous
compensator set-up $\Delta Z_s = \Delta m_s$.
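
To see an innovations gain in the simplest possible setting, consider a doubly
stochastic Bernoulli process whose intensity is a single unobserved random level
taking finitely many values, and take the state to be that level itself (so f = 0
and m = 0). In that special case Bayes' rule gives the filter directly, and one can
check by hand that the update has the innovations form "old estimate + K times
(observed - expected)" with K equal to the conditional variance of the level divided
by $\hat\lambda_n (1 - \hat\lambda_n)$. The Python sketch below verifies this numerically; the model,
the prior, and all function names are assumptions made for the illustration and
are not taken from the report.

```python
def bayes_update(dist, x):
    """Posterior of the hidden level given one more observation x in {0, 1}.
    dist maps each candidate level (a probability) to its current weight."""
    post = {lam: w * (lam if x == 1 else 1.0 - lam) for lam, w in dist.items()}
    total = sum(post.values())
    return {lam: w / total for lam, w in post.items()}


def mean(dist):
    return sum(lam * w for lam, w in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum((lam - mu) ** 2 * w for lam, w in dist.items())


def innovations_gain(dist):
    """K_n = Var(level | O_{n-1}) / (lam_hat * (1 - lam_hat))."""
    lam_hat = mean(dist)          # lam_hat_n = E(lambda_n | O_{n-1})
    return variance(dist) / (lam_hat * (1.0 - lam_hat))


def innovations_step(dist, x):
    """Z_hat_n = Z_hat_{n-1} + K_n * (X_n - lam_hat_n), with Z_hat_{n-1} = lam_hat_n."""
    lam_hat = mean(dist)
    return lam_hat + innovations_gain(dist) * (x - lam_hat)
```

Running `innovations_step` and `bayes_update` side by side on any observation
sequence should give the same conditional mean at every step. Note that in this
sketch the gain still depends on the full posterior, so it illustrates the form of
the update rather than a closed finite-dimensional recursion.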


Chapter  2.  Continuous  Parameter  Stochastic  Processes 


2.1. Introduction: Stopping times (Optional times) are fundamental to the
modern theory of martingales. They bring the spirit of (plane) geometry, with its
attendant intuitions, to the study of these processes and they give the probabilist
a way to replace the continuum with the countable. The development that we
outline here is pure Claude Dellacherie [Capacites et processus stochastiques,
1972]. After J. Doob's original work, this is the next monumental work on stopping
times and associated delineations of measurability. The introduction of
graphs of stopping times and the notions of “previsible”, “totally inaccessible”,
and “accessible” stopping times allow a classification of stochastic processes that
is both natural and necessary for the productive development of the modern
theory of semi-martingales, their applications and the general theory of stochastic
processes.

For instance, a “previsible” time is one which is anticipated by the previous
occurrence of a sequence of observable events. Accessible times are those whose
graphs consist of pieces of the graphs of previsible times. Totally inaccessible
times are therefore those times whose graphs are disjoint from the graphs of all
previsible times. It then follows that the graph of every stopping time is the
union of the graphs of accessible and totally inaccessible times.

Optional, accessible, and previsible times are used to construct “stochastic
intervals”, which in the manner of Borel are used to generate algebras of events with
properties similar to those of the generators. Measurability relative to these
algebras is then used to single out various classes of stochastic processes that form
the building blocks of a stochastic calculus for semi-martingales which at the
same time extends the classical Ito integral from Brownian motion to
semi-martingale integrators and is maximal (cannot be extended further) in an
intuitive, Cauchy sense.

These algebras also lead to a projection theory which yields a generalization of
the conditional expectation operator for processes, and of the “infinitesimal
generator” for measures.

The material in the following chapters is based primarily on Dellacherie [1972],
Meyer [1973], Dellacherie and Meyer [1980], Meyer [1970], Doleans-Dade and
Meyer [1970], Kunita and Watanabe [1967], Metivier [1982], Liptser and
Shiryayev [1977, 1978], Bremaud [1981], and most importantly, the Strasbourg
Seminaires in Probability, published in the Springer-Verlag “Lecture Notes in
Mathematics” from 1967 to the present.

Definitions that have been covered in the discrete parameter case and carry over
with little change will be treated formally here. An attempt will be made to give
some insight into others and compare some with the discrete case in the hopes of
understanding each a little better.

2.2. Filtrations: Let $(\Omega, H, P)$ be a probability space. A family of sub-sigma
algebras, $F := (F(t), t \ge 0)$, of H is said to be a filtration on $(\Omega, H)$, iff

(i) $F(s) \subset F(t)$, when $s \le t$.

If, in addition,

(ii) F(0) contains all P-null sets, and

(iii) $F(t+) := \bigcap_{h>0} F(t+h) = F(t)$, for all $t \ge 0$,

then the filtration, F, is said to satisfy the usual conditions (Dellacherie, 1972).
In this case, and with $H = \sigma( \bigcup_{t \ge 0} F(t)) =: F(\infty)$, the structure $(\Omega, H, F, P)$ is called
a filtered probability space satisfying the usual conditions. Finally, we note
that if F satisfies (iii), F is said to be right continuous. The first and second
conditions guarantee that each F(s) is complete. (As a reminder, a subset B of
$\Omega$ is a P-null set if there exists an event A in H such that $B \subset A$ and $P(A) = 0$.)

In addition to the $\sigma$-algebra, F(t+), we define $F(t-) := \sigma( \bigcup_{s<t} F(s))$. In general,
these algebras of events satisfy $F(t-) \subset F(t) \subset F(t+)$, for all $t \in [0, \infty)$. F(t-)
can be thought of as representing the history of observation “prior” to time t.

2.3. Stochastic Processes: A stochastic process is a mapping $X : [0,\infty) \times \Omega \to R$
such that, for each $t \ge 0$, the mapping $\omega \to X(t,\omega)$ is H-measurable.
(H-measurable means that the inverse image, $X_t^{-1}(B) = \{ \omega : X(t,\omega) \in B \}$, under X(t)
of any real Borel set B, is contained in H.) More directly, in terms of familiar
concepts, a stochastic process is a family of random variables, r.v.'s, indexed by $t \ge 0$.

2.3.1. The trajectories (paths) of a stochastic process, X, are the mappings
$t \to X(t,\omega)$, indexed by $\omega$ in $\Omega$. Regularity properties attributed to a process, X,
such as continuity or right continuity or left limits refer to the trajectories,
and will be said to hold almost surely, relative to P (a.s.P), if the set of all
$\omega \in \Omega$ for which the property holds has P-measure equal to one. For example, X is
continuous (a.s.P), if $P(\{\omega \in \Omega : (t \in R_+) \to X(t,\omega) \text{ is continuous}\}) = 1$; that is, if,
relative to P, almost all trajectories are continuous functions on $R_+$. After a
while the qualifier, (a.s.P), will be taken to be understood and will only be
mentioned occasionally.

However, even with all this explanation, statements about such things can be a
little obscure; for example, a process which is a.s.P continuous at each t is not
necessarily a.s.P continuous! Such an example is given below after Lemma 2.3.3.

2.3.2. Two processes, X and Y, are said to be modifications of each other if

$$P( \omega \in \Omega : X(t,\omega) = Y(t,\omega) ) = 1,$$

for each $t \ge 0$. More strongly, if

$$P( \omega \in \Omega : X(t,\omega) = Y(t,\omega), \text{ for all } t \ge 0 ) = 1,$$

the two processes are called indistinguishable.

Thus, two processes are indistinguishable if their paths coincide a.s.P. As in the
discrete case, indistinguishability establishes an equivalence relation on the set of
processes on the common probability space $(\Omega, H, P)$ indexed by $R_+$. In this
sense we identify all indistinguishable processes. A process which is indistinguishable
from the process that is identically zero is said to be evanescent. A subset
B of $[0,\infty) \times \Omega$ is called a random set. Random sets are said to be evanescent if
their indicator functions are evanescent processes. Equivalently, a random set, B,
is evanescent if its projection into $\Omega$ is a P-null set. In the language of random
sets two processes X and Y are indistinguishable iff the random set
$\{ X \ne Y \} := \{ (t,\omega) : X(t,\omega) \ne Y(t,\omega),\ t \ge 0,\ \omega \in \Omega \}$ is evanescent.

Clearly, if X and Y are indistinguishable and X has continuous (a.s.P) paths, then
Y also has continuous paths. If X and Y are modifications, unlike the discrete
case, one cannot claim indistinguishability. However, we have the following
(Dellacherie, 1972)

2.3.3. Lemma: If X is a modification of Y and these processes are right continuous,
then they are indistinguishable.


2.3.4. Remark: Just use the modification property on the rationals, a countable
set: the union of the events $[X(r) \ne Y(r)]$ over rational $r \ge 0$ is P-null and, by
right continuity, contains $[X(t) \ne Y(t)]$ for all $t \ge 0$.

This  Lemma  is  the  first  hint  that  path  regularity  in  the  form  of  right  continuous 
paths  will  be  an  important  assumption  in  this  note. 

2.3.5. Remark: We now give an example of two processes that are modifications
of one another, but are not indistinguishable, one of them not being right
continuous. Let $\Omega = R_+$, H be the real Borel sets of $R_+$, and P the probability
measure induced on H by the standard exponential distribution. Let X be the
diagonal process, $X(t,\omega)$ equal to 1 on the diagonal of $R_+ \times \Omega$ and equal to 0
elsewhere. Set Y equal to 0 on $R_+ \times \Omega$. Then X is a modification of Y, since
$P([X(t) \ne Y(t)]) = P(\{t\}) = 0$ for each t in $R_+$. To see that X is not right
continuous, just note that the set of X trajectories which are not right (or left)
continuous has P-measure 1: $P(\{\omega : \omega = t \text{ for some } t \ge 0\}) = P(R_+) = 1$.
Since this is the same as $P(\{\omega : X(t,\omega) \ne 0 \text{ for some } t \ge 0\}) = 1$, the two
processes X and Y are certainly not indistinguishable.

What if we replace X by Z, where Z is one on the diagonal of $R_+ \times \Omega$ only when
the coordinates are rational numbers, and otherwise Z is zero? With the same P,
Z is again a modification of Y, but this time Z is a.s.P right continuous and so
indistinguishable from Y.

2.3.6.  We  should  point  out  that  even  though  we  have  assumed  that  our  processes 
are  real  valued,  we  could  have  been  more  abstract  and  taken  the  state  space  of 
the  processes  to  be  some  measurable  space  (E,B(E)),  where  B(E)  is  the  Borel 
sigma-algebra  generated  by  the  open  sets  in  E.  Much  of  what  we  will  talk  about 
here  still  holds  in  this  more  general  case  with  a  few  qualifiers.  For  example,  in  the 
previous  Lemma  we  would  have  had  to  assume  that  E  is  separable. 

2.3.7. A stochastic process, X, is said to be adapted to the filtration F, or
F(t)-adapted, if the mapping, $\omega \to X(t,\omega)$, is F(t)-measurable for each $t \ge 0$. Historically,
adapted processes were said to be nonanticipating. A process X is always
adapted to $F_X(t)$, the filtration generated by X, $F_X(t) := \sigma( X(s),\ 0 \le s \le t )$.

Clearly,  under  the  “usual  conditions”,  modifications  of  adapted  processes 
are  adapted. 

In applications, when F(t) is interpreted as a history of the evolution of a collection
of processes, an F-adapted process will be said to be observable relative to
these processes.


2.3.8. It is important to realize that classical theories of martingales, Markov
processes and stopping times were only concerned with internal histories. The
modern theory on the other hand just assumes that there is a single filtration,
a reference family, relative to which all processes are adapted. Stopping times
are defined relative to this filtration and used to characterize several $\sigma$-algebras of
random events. In the modern setting of a single filtration, applications generally
involve several partially ordered families of filtrations. For example, in the
nonlinear filtering encountered in Chapter 1, we have the filtrations corresponding to
the state and observation processes with the “state filtration” containing the
observation filtration.

2.3.9. There are several additional types of measurability that are necessary for
the calculus of martingales with a continuous parameter (time) set that cannot be
discerned in the discrete parameter case. For the moment, we only introduce
measurability relative to the product spaces $B([0,\infty)) \times F(\infty)$ and $B([0,t]) \times F(t)$:

A process is said to be measurable, if the mapping $X : [0,\infty) \times \Omega \to R$ is
$B([0,\infty)) \times H$-measurable (i.e., measurable as a function of two variables). In
most cases we will consider processes which are both measurable and adapted.
That is, a measurable mapping of $([0,\infty) \times \Omega,\ B([0,\infty)) \times H)$ into $(R, B(R))$ such
that for each fixed t, $\omega \to X(t,\omega)$ is F(t)-measurable.

Notice that when $[0, \infty)$ is replaced by $Z_+$, as in the discrete parameter case,
every process is measurable; in the first chapter adapted processes corresponded
to adapted and measurable processes.

2.3.10. By restricting the notion of measurability to the time interval [0,t], we
obtain measurability relative to the filtration, or progressive measurability
relative to $(F(t), t \ge 0)$: X is said to be F-progressive if, for each $t \ge 0$, the mapping
$(s,\omega) \to X(s,\omega)$, restricted to $[0,t] \times \Omega$, is $(B[0,t] \times F(t))$-measurable, where B[0,t] is
the Borel $\sigma$-algebra of [0,t]. Random sets are called progressive if their indicator
processes are progressive.

Clearly, if X is progressive then it is adapted and measurable. The example
given after the following Lemma shows that X can be adapted without being
progressive. Dellacherie and Meyer [1975, IV T15] show

2.3.11. Lemma: If X is adapted and right continuous (left continuous), then X is
progressive.


2.3.12. Remark: Let X, Y be the processes given in the example after Lemma
2.3.3. Take the same probability space as in that example and define the
filtration, $F = (F(t))$, by letting F(t) be the $\sigma$-algebra generated by the family, $\{ \{s\} :
s \le t \}$. Then the diagonal process, X, is F-adapted, since
$[X(t) = 1] = \{t\} \in F(t)$, but X is not F-progressive since $\{X = 1\}$, restricted to
$[0,t] \times \Omega$, is the diagonal of the rectangle $[0,t] \times [0,t]$ and this does not belong to
$B([0,t]) \times F(t)$ because [0,t] does not belong to F(t), which contains only countable
sets and their complements. We have already noted that X is not right continuous.

One of the consequences of the “usual conditions” is that every martingale has a
modification that is right continuous and has left limits (at each point of a
path, a.s.P). Processes which are right continuous and have left limits are
sometimes called cadlag, or rcll; the French abbreviation, “cadlag”, stands for
“continu à droite, limites à gauche”. Recently, some authors have begun referring
to such processes as Skorokhod processes, after the Russian mathematician
A. N. Skorokhod [1956]. We will use the last descriptor. The full importance of
the Skorokhod assumption will begin to emerge in Section 2.8. Essentially, all
the processes considered in Chapters 5 and 6 will be taken to be Skorokhod.

2.4. Stopping Times: Often in probability we are interested in the time, T, at
which a certain random phenomenon associated with a stochastic process, X,
occurs. E.g., the first time, $T(\omega)$, that the path, $t \to X(t,\omega)$, hits a particular
level. In fact, if F(t), of the filtration F = (F(t)), is interpreted as the collection of
all events associated with the evolution of a process, X, during the time interval
[0,t], we can make precise the statement that this phenomenon occurred by
time t by requiring that $[T \le t] := \{\omega \mid T(\omega) \le t\}$ belong to F(t), for every $t \ge 0$.
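
The defining requirement, that whether T has occurred by time t can be decided
from the history up to t alone, is easy to see on a path sampled at integer times.
The Python sketch below is only a discrete-time illustration of that idea (the
function name and the convention of returning infinity when the level is never
reached are assumptions for the example); it does not address the measurability
questions that arise in continuous time.

```python
import math


def first_hitting_index(path, level):
    """First index n with path[n] >= level, or infinity if there is none.
    The answer 'T <= n' depends only on path[0..n]."""
    for n, value in enumerate(path):
        if value >= level:
            return n
    return math.inf
```

Changing the path after the hitting index leaves the result unchanged, which is
the sampled-path version of the statement that the event "T has occurred by time
n" belongs to the history up to time n.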

2.4.1. Definition: A positive r.v. T, finite or not, is called a stopping time (or
optional time) relative to the filtration $F = (F(t), t \ge 0)$, if the event $[T \le t] \in
F(t)$, for each $t \ge 0$. (Note: “positive” is meant in the sense of nonnegative.)

In Chapter 1 we saw that for non-negative, integer valued G-stopping times,
$[T = n] \in G_n$ iff $[T \le n] \in G_n$ iff $[T < n] \in G_{n-1}$. Here the situation is a little
different. To appreciate the difference, let T be an F-stopping time. Then

$$[T < t] = \Big( \bigcup_{\epsilon > 0} [T \le t - \epsilon] \Big) \in F(t),$$

since $[T \le t - \epsilon] \in F(t - \epsilon) \subset F(t)$, for $\epsilon > 0$, by monotonicity of filtrations (it
suffices to take the union over $\epsilon = 1/n$, so that the union is countable). Therefore,
if T is an F-stopping time, $[T < t] \in F(t)$ and then so does $[T \ge t]$, for all
$t \ge 0$. But if all that we know about a mapping $T : \Omega \to R_+$ is that $[T < t] \in F(t)$ for
all $t \ge 0$, then all we can conclude is that

$$[T \le t] = \bigcap_{h > 0} [T < t + h] \in F(t+),$$

so T is just an F(t+)-stopping time. Therefore, if the filtration is right continuous
then $[T < t] \in F(t)$ for all $t \ge 0$ implies that T is an F-stopping time. Thus, under
the “usual conditions” the two conditions are equivalent. As noted already we
will generally assume that our filtrations are right continuous.

Any nonnegative constant is a stopping time relative to any filtration. For example,
if $T(\omega) = c$, for all $\omega \in \Omega$, then $[T \le t] = \Omega$ when $c \le t$ and $= \emptyset$, otherwise.
If T is a stopping time and c is a nonnegative real number, then c + T is also a
stopping time: $[T + c \le t] = [T \le t - c] \in F(t-c) \subset F(t)$, for all $t \ge c$, and
$[T + c \le t] = \emptyset$ for $t < c$.

There are numerous interesting simple results concerning stopping times that are
needed to develop an intuition about them, but covering them is beyond the
scope of this short note. Probably the best treatments are given by
Dellacherie (1972) and Metivier (1982). We will try to introduce only what will be
needed to provide a reasonable understanding of “previsibility” and its role in
the theory of martingales and stochastic integration.

2.4.2. We observe in passing then that the minimum and maximum of two
F-stopping times are again F-stopping times. Also, the supremum of a sequence of
F-stopping times is an F-stopping time:

$$[\, \sup\{T_n : n \ge 1 \} \le t \,] = \Big( \bigcap_{n=1}^{\infty} [\, T_n \le t \,] \Big) \in F(t).$$

The infimum, S, of a sequence of F-stopping times is, however, an F(t+)-stopping
time. That is, we can only claim $[S \le t] \in F(t+)$, for all $t \ge 0$:

$$[S \le t] = \bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \Big[ T_n < t + \frac{1}{k} \Big] \in \bigcap_{j=1}^{\infty} F\Big(t + \frac{1}{j}\Big) = F(t+).$$

But again, since we assume the “usual conditions”, S is also an F-stopping time.
Hence, the limsup and liminf of a sequence of F-stopping times are F-stopping
times. Therefore, whenever the limit of a sequence of stopping times exists, the
limit is a stopping time. Another simple fact is that the sum of any two
F-stopping times is again an F-stopping time.

2.4.3. In the realm of sophisticated stopping times we mention the “hitting”
time or debut of a random set, A, defined as $D_A(\omega) := \inf\{ t \in R_+ : (t,\omega) \in A \}$, or
as $D_A(\omega) := +\infty$, if the section $A(\omega) = \{ t : (t,\omega) \in A \}$ is empty. Dellacherie
[1972] used capacity theory to show that when the filtration satisfies the usual
conditions and A is an F-progressive random set then the debut of A is an
F-stopping time. We will return to this example later in the chapter where we will
introduce “k-debuts”. For this purpose, notice that we can write

$$D_A(\omega) = \inf\{ t \in R_+ : [0,t] \cap A(\omega) \text{ contains at least one element} \}.$$

2.4.4. Definition: Given an F-stopping time, T, the family of events which occur
prior to T, denoted F(T), is defined as the set of all events
$A \in F(\infty) := \sigma( \bigcup_{t \ge 0} F(t))$, for which $A \cap [T \le t] \in F(t)$, for each $t \ge 0$.

2.4.5. If T is a.s.P equal to a constant time, t, then F(T) = F(t). This justifies
the notation F(T) when T is a stopping time. Further, it is easy to verify that
F(T) is a sigma-algebra and T is F(T)-measurable. (For the latter, fix $s \ge 0$ and
set $A = [T \le s]$; then $A \cap [T \le t] = [T \le \min(s,t)] \in F(t)$, for all $t \ge 0$, so $A \in F(T)$,
and consequently, T is F(T)-measurable.)

These $\sigma$-algebras are monotone at stopping times, in the sense of the next
theorem.

2.4.6. Theorem:

Let S and T be F-stopping times. If $S \le T$, a.s.P, then $F(S) \subset F(T)$.

Remark: $S \le T$ implies $[T \le t] \subset [S \le t]$, so that for any $A \in F(S)$

$$A \cap [T \le t] = A \cap [S \le t] \cap [T \le t] \in F(t).$$

Therefore, $A \in F(T)$.

Remark:  The  following  are  just  as  easy  to  prove: 
o  AeF(S)  implies  Ap|[S<T]cF(T) 
o  F(min(S,T))  =  F(S)f|F(T) 

o  [S  <  T],  [S  >  T]  and  [S  =  T]  are  in  F(S)  and  F(T) 
If S is a positive r.v. which is measurable relative to F(T), then S is not
necessarily an F-stopping time. A sufficient condition is that S ≥ T. Since this is
simple and important, it is worth a proof. Just use [T ≤ t] to partition the sample
space, Ω. Then since S ≥ T, [S ≤ t] = [S ≤ t] ∩ [T ≤ t] and, because S is
F(T)-measurable, the right side of this equation is in F(t) for all t ≥ 0. Hence,
[S ≤ t] ∈ F(t) for all t ≥ 0, so that S is a stopping time.

This has the consequence that every F-stopping time, T, can be written as the
limit of a decreasing sequence of F-stopping times, each taking a countable
number of values: Just define T(n) by setting N = N(n) = 2^n and

T(n) := Σ_{k≥1} (k/N) 1[(k-1)/N ≤ T < k/N],

when T is finite, and oo, otherwise. Then T(n) > T and the previous result
applies, making T(n) a stopping time. Also, on {w : T(w) < oo} = [T < oo], we have
0 < T(n,w) - T(w) ≤ 1/N(n); hence T(n,w) → T(w), as n → oo, for every w in
Ω. Finally, if T(n,w) = k/N(n), then we must either have
T(n+1,w) = (2k-1)/N(n+1) < k/N(n), or T(n+1,w) = k/N(n). So,
T(n,w) ≥ T(n+1,w), for all w ∈ Ω.
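The dyadic construction can be checked numerically at a single sample point. The following is a minimal Python sketch, assuming a finite value T(w) is given as an exact rational; the name dyadic_upper and the value 1/3 are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import floor

def dyadic_upper(t, n):
    """Return k/2**n, where k is the integer with (k-1)/2**n <= t < k/2**n.

    t stands for one finite value T(w) >= 0; an infinite T(w) stays infinite.
    """
    N = 2 ** n
    k = floor(Fraction(t) * N) + 1
    return Fraction(k, N)

t = Fraction(1, 3)  # hypothetical value of T(w)
approximations = [dyadic_upper(t, n) for n in range(1, 8)]
```

Each entry lies strictly above t, is within 1/2^n of it, and the list is nonincreasing, which mirrors the three properties used in the argument.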

Notice that without the “countable valued” requirement, it is obvious that the
sequence S_n = T + 1/n decreases to T a.s.P on [T<oo].

Observe  carefully  that  one  cannot  make  a  symmetric  statement  relative  to 
increasing  sequences  of  stopping  times.  In  the  next  section  we  will  see  that 
requiring  this  symmetry  leads  to  the  notion  of  “previsible  stopping  times”. 

2.5. Stochastic Intervals: Let S and T, with S ≤ T, be two F-stopping times
and set

[[S,T)) := { (t,w) | S(w) ≤ t < T(w), 0 ≤ t < oo, w ∈ Ω }.

[[S,T)) is called a stochastic interval; if we want to emphasize the underlying
filtration, we will write F-stochastic interval. Stochastic intervals [[S,T]],
((S,T)), and so on, are defined in the same manner. If S=T, then [[T]] := [[T,T]]
is called the graph of T.

2.5.1. F-stochastic intervals are F-progressive random sets. That is, the
indicator function of an F-stochastic interval is an F-progressive process on
R+ x Ω.

First note that (s,w) → 1_[[S,T))(s,w) is F-adapted since 1_[[S,T))(s,w) = 1 if
S(w) ≤ s < T(w), 0 otherwise, and [S ≤ s] ∩ [T > s] ∈ F(s). Since 1_[[S,T)) has right
continuous paths by inspection, Lemma 2.3.11 applies. Similarly, ((S,T]] has an
F-progressive indicator function. The other types of stochastic intervals are
handled in the same way or as combinations of stochastic intervals whose regularity
properties are known. E.g., ((S,T)) = ((S,T]] ∩ [[S,T)).


2.5.2. We can now show the converse of the result guaranteeing that a debut is a
stopping time. That is, every stopping time, T, is the debut of a progressive
random set: Just set A = [[T,oo)). Then A is progressive and the statement follows
by noting that [D_A ≤ t] = [T ≤ t]. Also note that if A = [[S,T)), then
D_A = S on [S < T] and = oo, on [S = T].

When A is of the form A = {(t,w): X(t,w) ∈ B}, where X is some stochastic
process, the debut of A is called the hitting time or the first entrance time of X
into B. By what has been said, if X is progressively measurable and B is a real
Borel set, then the debut of A is a stopping time. The best discussion of this is
given in Williams [1979].
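For a path observed only on a finite time grid, the infimum in the definition of a hitting time reduces to a first index. The following is a minimal Python sketch under that assumption; the function hitting_time, the grid, and the sample values are hypothetical, and a grid version only approximates the continuous-time debut.

```python
import math

def hitting_time(times, values, in_B):
    """First grid time at which the sampled value lies in B, or math.inf if none does."""
    for t, x in zip(times, values):
        if in_B(x):
            return t
    return math.inf

times = [0.0, 0.5, 1.0, 1.5, 2.0]
path = [0.0, 0.4, 1.2, 0.7, 1.5]                    # hypothetical values X(t, w)
T = hitting_time(times, path, lambda x: x >= 1.0)   # B = [1, oo)
```

The convention that the hitting time is +oo when the section is empty corresponds to the math.inf return value.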

2.5.3.  The  following  example  will  be  used  later  on  as  an  example  of  a  stopping 
time  which  is  not  a  previsible  time.  (It  is  an  exercise  in  Metivier  [1982]).  We 
will  specialize  it  somewhat  in  order  to  have  a  simple  example  to  illustrate  the 
graph  of  a  stopping  time.  Let  A  be  a  nonempty,  proper  subset  of  the  interval 
[0,1] =: Ω. Set F(0) = {∅, Ω} and F(1) = {A, A^c, ∅, Ω}. Define the filtration
F(t) := F(0), if t ∈ [0,1) and := F(1), if t ≥ 1. Then (F(t), t ≥ 0) is a right
continuous filtration. Set T := 1 + 1_A. Then T is an F-stopping time:

[T ≤ t] = ∅     if 0 ≤ t < 1,
        = A^c   if 1 ≤ t < 2,
        = Ω     if 2 ≤ t,

so that [T ≤ t] ∈ F(t), for all t ≥ 0. If we take the usual two dimensional
coordinate system with time (the range of T) as the horizontal axis and Ω as the
interval [0,1] on the vertical axis, then with A = [0,.5] the graph, [[T]], is the
following union of straight line segments:

[[T]] = {(1,w): w ∈ (.5,1]} ∪ {(2,w): w ∈ [0,.5]}.
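The stopping-time property in this example can be verified mechanically on a discretized sample space. The following Python sketch assumes Ω is replaced by the eleven grid points w = i/10 and A = [0,.5]; the names OMEGA, F, and event_T_le are illustrative only.

```python
OMEGA = frozenset(range(11))               # grid points w = i/10, i = 0..10
A = frozenset(i for i in OMEGA if i <= 5)  # A = [0, 0.5] restricted to the grid
A_C = OMEGA - A

def T(i):
    """T = 1 + 1_A evaluated at the grid point w = i/10."""
    return 2 if i in A else 1

def F(t):
    """F(t) as a collection of events: trivial before time 1, generated by A afterwards."""
    trivial = {frozenset(), OMEGA}
    return trivial if t < 1 else trivial | {A, A_C}

def event_T_le(t):
    """The event [T <= t] as a set of grid points."""
    return frozenset(i for i in OMEGA if T(i) <= t)

checks = {t: event_T_le(t) in F(t) for t in (0, 0.5, 1, 1.5, 2, 3)}
graph = sorted((T(i), i / 10) for i in OMEGA)  # points (T(w), w) of the graph [[T]]
```

Every value in checks is True, and graph lists the two vertical segments at times 1 and 2.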
2.5.4. Definition: The family of events which occur strictly prior to the
stopping time, T, is denoted by F(T-) and is defined as the sigma-algebra
generated by F(0) and events of the form A ∩ [T > t] for all A ∈ F(t) and all t ≥ 0.


As with F(T), T is F(T-)-measurable. Also, since the generators of F(T-) belong
to F(T), F(T-) ⊂ F(T). Also, if S is a stopping time with S ≤ T then
F(S-) ⊂ F(T-).


2.5.5. Remark: It is important to note that left continuity of F does not imply
that F(T) = F(T-). For example, let Ω := [0,oo), and B([a,b)) be the Borel
sigma-algebra of subsets of the interval [a,b). Set F(t) := B([0,t]) ∪ {(t,oo), R+}
for all t ≥ 0 and note that F(t-) = F(t) = F(t+) for all non-negative t. Setting
T(w) = w on Ω defines T to be an F-stopping time with F(T) ≠ F(T-). This is an
exercise in Metivier [1982]. However, this is not meant to imply that the
mathematical setup is simple. This setup, or a slight variation, is at the heart of
numerous papers (e.g. Dellacherie (1970), Chou and Meyer [1975] and finally, with
corrections, in Chapter 4 of Dellacherie and Meyer [1975]).

For example, in the last reference, a filtration, G, is taken to be

G(t) = sigma( B((0,t)), [t,oo) )

for all t ∈ [0,oo]. Then G(t+) contains {t} and (t,oo), and these sets are not in
G(t). Therefore, in this case G(t) ≠ G(t+) and G is not right continuous. It
follows that T, the identity mapping as defined at the beginning of this remark,
is a G(t+), but not a G(t), stopping time. We will return to this example at the
end of the next section to illustrate the special classes of stopping times
introduced there.


2.6. Previsible, Accessible, Optional Times: Recall again that, unless stated
otherwise, we assume that the “usual conditions” hold on the underlying
filtrations.

Earlier  when  we  were  approximating  stopping  times  from  above,  we  pointed  out 
that  they  cannot  in  general  be  approximated  from  below  by  increasing  sequences 
of  stopping  times.  However,  from  the  standpoint  of  the  calculus  of  martingales, 
those stopping times that do have this property can be used to characterize the
most important class of measurable processes. Dellacherie and Meyer (1980) point
out that processes with this type of measurability (previsibility) play the same
role in stochastic integration as Borel functions play in the classical theories of
measure  and  integration.  The  construction  of  this  class  begins  with  the  following 
definition. 

2.6.1. Definition: An F-stopping time, T, is said to be F-previsible
(predictable) if there exists a sequence (T(n)) of F-stopping times with the
following properties:

(i) T(n,w) < T(w), a.s.P, on [T>0], n>0;

(ii) (T(n)) is increasing (a.s.P) and converges (a.s.P) to T.

Note:  Generally,  when  there  is  no  possibility  of  confusion,  we  will  drop  reference 
to  the  underlying  filtration,  F. 

The sequence of stopping times, (T(n)), is said to announce T, and is called an
announcing sequence for T. Clearly, if T is a stopping time and c is a
positive real number, then T+c is previsible. Just take T(n) := T + c(1 - (1/n)),
n>0.

Intuitively,  if  T  is  the  first  time  an  event  can  happen,  then  T  is  previsible  if  we 
are  aware  that  the  event  is  about  to  happen;  a  sequence  of  events  takes  place 
that  foretell  the  occurrence  of  T.  As  a  matter  of  fact,  the  announcing  sequence  is 
also  called  a  foretelling  sequence. 

The traditional example of a previsible stopping time is the first time that an
adapted, continuous (hence progressive) process, X, (X(0,w) = 0) hits a
singleton set: For example,

T(w) := inf{ t : X(t,w) = 1 }

and := oo, when {...} = ∅. For definiteness, take X to be standard Brownian
motion. To see that T is previsible, just take T_n, an announcing sequence of T,
to be T_n(w) := inf{ t : X(t,w) = 1 - 1/n }.
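The announcing sequence can be illustrated on a single continuous path. The following Python sketch replaces the Brownian path by a fixed piecewise-linear path starting at 0 (an assumption made only so that hitting times can be computed exactly by interpolation); first_hit and the node values are hypothetical.

```python
import math

def first_hit(times, values, level):
    """First time a piecewise-linear path with nodes (times, values) reaches `level`.

    Returns math.inf if the level is never reached.
    """
    if values[0] >= level:
        return times[0]
    for i in range(len(times) - 1):
        x0, x1 = values[i], values[i + 1]
        if x0 < level <= x1:  # first upward crossing of the level
            return times[i] + (level - x0) * (times[i + 1] - times[i]) / (x1 - x0)
    return math.inf

times = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 0.6, 0.2, 1.4]          # hypothetical continuous path X(., w)
T = first_hit(times, values, 1.0)
announcing = [first_hit(times, values, 1 - 1 / n) for n in range(1, 50)]
```

On this path the times in announcing are nondecreasing, stay strictly below T, and approach T as n grows, which is what properties (i) and (ii) of Definition 2.6.1 require.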

A famous non-previsible stopping time, T, is the “time to the kth event” of a
Poisson process. The standard proof of this fact can be found in Liptser and
Shiryayev [Vol II]. We will give a simpler but more sophisticated demonstration by
Aldous [1981] that also yields a result useful later in this chapter. Let N = (N(t),
t ≥ 0) be a Poisson process with parameter μt at time t. Then s → N(s+t) - N(t)
defines a Poisson process with parameter μs and so by the Strong Markov
Property, N(s+T) - N(T) is again Poisson, μs. Now, if we assume that T is
previsible, then T+s is previsible and announcing sequences exist for both.
Evaluating the Poisson increments at these announcing sequences and passing to the
limit, we have that N((s+T)-) - N(T-) is Poisson, μs. Remembering that N has right
continuous paths and letting s → 0+, we obtain N(T) - N(T-) = 0, a.s.P. This
states that T is not a jump time of the process as originally supposed. Therefore,
T cannot be previsible.

2.6.2. We now introduce a stopping time which is (a.s.P) never equal to any
previsible time; appropriately, it will be called a totally inaccessible time. The
time to the first jump (event) of a Poisson process is such a time. The
“complement” of a totally inaccessible time will be said to be accessible. More
formally, we give the following:

2.6.3.  Definition:  Let  T  be  an  F-stopping  time.  Then 

(i) T is said to be accessible if there exists a sequence of
previsible times, (T(n)), with the property

∪_{n>0} [[T(n)]] ⊃ [[T]], up to an evanescent event.

(ii) T is said to be totally inaccessible, if the intersection,
[[T]] ∩ [[S]], is empty, up to an evanescent event, for each
previsible stopping time, S.

That is, the graph of an accessible time, T, is made up of sections of graphs of
previsible times, and the graph of a totally inaccessible time is disjoint from the
graph of every previsible time. Parts (i) and (ii) of the definition can be written
P( ∪_n [T(n) = T] | [T < oo] ) = 1, and P( [T = S] | [T < oo] ) = 0, respectively.


Remark: It is clear from the definition that if T is previsible then it is accessible
and optional. The example of Dellacherie and Meyer at the end of the last section
provides a case where a stopping time is nonprevisible and accessible. We state
some of their observations on this example.

We noted that T, the identity mapping as defined there, is a G(t+) but not a
G(t) stopping time. Dellacherie and Meyer show further that every G(t)-stopping
time is G-predictable. Continuing with this example, a probability
measure P is introduced and G is completed relative to P. Call the completed
filtration  G  .  Dellacherie  and  Meyer  show  the  following: 

(a) If P is nonatomic, then G  satisfies G  (S) = G  (S-) for
all previsible times, S. Also, the identity mapping, T, is
totally inaccessible.

(b) If P is purely atomic and nondegenerate, then the
identity mapping, T, is a nonpredictable, accessible time.

2.6.4. Definition: If T is a stopping time and A is an event (A ∈ H), set T_A(w) =
T(w), if w ∈ A, and = oo, otherwise. Then T_A is called the restriction of the
stopping time, T, to the event A.

It follows immediately from the definition that [T_A ≤ t] = A ∩ [T ≤ t]. Then, from
the definition of F(T), T_A is a stopping time iff A ∈ F(T).
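The identity [T_A ≤ t] = A ∩ [T ≤ t] is easy to confirm on a finite sample space. The following Python sketch uses a made-up six-point sample space and made-up values of T; restrict and event_le are illustrative helper names.

```python
import math

T = {0: 1.0, 1: 2.0, 2: 2.0, 3: 3.5, 4: math.inf, 5: 0.5}  # hypothetical T(w)
A = {1, 3, 4}                                               # hypothetical event

def restrict(T, A):
    """T_A(w) = T(w) if w is in A, and +oo otherwise."""
    return {w: (T[w] if w in A else math.inf) for w in T}

def event_le(S, t):
    """The event [S <= t] as a set of sample points."""
    return {w for w in S if S[w] <= t}

T_A = restrict(T, A)
identity_holds = all(event_le(T_A, t) == A & event_le(T, t) for t in (0, 1, 2, 3.5, 10))
```

The check covers only the set identity; whether T_A is a stopping time still depends on A belonging to F(T), which this sketch does not model.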

Remark: As can be guessed, the graph of any stopping time can be written
(uniquely,  a.s.P)  as  the  union  of  the  graphs  of  accessible  and  totally  inaccessible 
times. 

2.6.5.  Theorem  (Dellacherie): 

Let T be a stopping time. Then there exist events A and B in F(T-) which
constitute a unique, up to P-measure zero, partition of [T<oo], such that T_A is
accessible and T_B is totally inaccessible.

We  now  mention  a  sequence  of  stopping  time  results  that  are  useful  in  the  study 
of  stochastic  processes.  Our  principal  use  will  be  in  the  last  of  these  results 
which  gives  a  characterization  of  previsibility  of  restrictions  of  stopping  times. 

o Let S and T be previsible (accessible, totally inaccessible) times. Then the
minimum and the maximum of S and T are previsible (accessible, totally
inaccessible).

o Let T be a stopping time and A ∈ F(T). If T is accessible (totally inaccessible)
then T_A is accessible (totally inaccessible).

(This is immediate from [[T_A]] ⊂ [[T]].)

o Let T = lim_{n→oo} T_n.

(a) If (T_n) is an increasing sequence and each T_n is
previsible, then T is also previsible.

(b) If (T_n) is a decreasing sequence and for each w ∈ Ω there
exists a natural number n = n(w) such that
T_n(w) = T(w), then T is previsible (accessible) whenever
the T_n are previsible (accessible).

From  this  result  it  can  be  shown  that 

o If T is a stopping time then the collection of all A ∈ F(T) such that T_A is
previsible is closed under countable unions and countable intersections.

This  result  can  then  be  used  to  show  the  following  important  result  that  will  be 
used  several  times  in  the  sequel: 

o Let T be a stopping time and A ∈ F(T). Then if T_A is previsible, A ∈ F(T-).
Conversely, if A ∈ F(T-) and T is previsible, then T_A is previsible.

2.7. Previsible, Accessible, Optional Processes: Let X be a stochastic
process on (Ω,H) and recall that H has been taken to be the smallest sigma algebra
containing the union of all members of the filtration, F, and then denoted F(oo).

If T is a positive r.v. on (Ω,H), then by X(T) we mean the mapping w →
X(T(w),w) of Ω into R. If X is B[0,oo) x H-measurable, then this mapping defines
a r.v. since it is the composition of the measurable mappings w → (T(w),w) and
(t,w) → X(t,w).

When X is a Skorokhod process, Meyer [1973] gives a simple method for
approximating X(T), by X(T_n), where for each n, T_n is a countable valued random
variable and the sequence (T_n) decreases point-wise to T: Let D_n = {k/2^n : k ∈ Z+},
and set T_n(w) equal to the infimum of D_n ∩ (T(w),oo). Then the right
continuity of X gives X(T_n) → X(T), a.s.P.

X(T) is called the process evaluated at time T. In general, we will allow T to
be an arbitrary stopping time. This means that T will be allowed to take the
(nonreal) value oo of the extended set of positive real numbers, [0,oo]. Since we
define processes X on R+ x Ω and not on [0,oo] x Ω, we write X_T 1[T<oo] to denote X_T
on the event [T<oo] and zero on the event [T=oo]. We give the following
sufficient condition for the F(T) measurability of X(T).


2.7.1. Theorem

If X is F-progressively measurable and T is an F-stopping time, then 1[T<oo] X(T)
is F(T)-measurable.

2.7.2. Remark: We will give a sketch of Dellacherie’s proof of this result. Let A
be any Borel set of the real line. We must show that [X(T) ∈ A] ∩ [T ≤ t] ∈ F(t), for
all nonnegative t. But the event formed by this intersection is equal to
[X(S(t)) ∈ A] ∩ [T ≤ t] where S(t) = min(T,t) is easily seen to be F(t)-measurable.
But the random variable X(S(t)) is measurable relative to F(t), since it is
obtained as the composition of the mappings w → (S(t)(w),w) of (Ω,F(t)) into
([0,t] x Ω, B[0,t] x F(t)) and (s,w) → X(s,w) of ([0,t] x Ω, B[0,t] x F(t)) into
(R,B), and because of the definition of a progressive process.

2.7.3. Remark: As in Chapter 1, an important example is obtained when the
process, X, is evaluated at the random time S := T∧t, where (T∧t)(w) =
min(T(w),t) for t ≥ 0. Then X(S) is called the process stopped at time T and
denoted X^T. Thus, X^T(t,w) = X(T(w)∧t,w), and T∧t is sometimes called a
truncation of T. The use of stopped processes is fundamental to the modern
theory of martingales. As noted in Chapter 1, one reason for this is the Doob
Optional Sampling (Stopping) Theorem and another is based on the concept of
localization to be discussed at some length in Chapter 6.
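For a path sampled on a time grid, the stopped process simply freezes the path at its value at time T. The following is a minimal Python sketch, assuming T(w) is either one of the grid times or +oo; stopped_path and the sample data are hypothetical.

```python
import math

def stopped_path(times, values, T):
    """Sampled path of the stopped process: value at time min(T, t) for each grid time t.

    Assumes T is one of the grid times or math.inf.
    """
    value_at = dict(zip(times, values))
    return [value_at[min(t, T)] for t in times]

times = [0, 1, 2, 3, 4]
values = [0.0, 2.0, -1.0, 5.0, 3.0]     # hypothetical X(t, w)
stopped = stopped_path(times, values, 2)
unstopped = stopped_path(times, values, math.inf)
```

With T(w) = 2 the path is constant from time 2 onward, and with T(w) = +oo the path is left unchanged.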

Another important stopping time that can be constructed from T is the
translation of T: T_t = T + t, t ∈ R+. Then X(T_t) is called a random shift of X. (For
more information see Chung, Doob [1965].)

2.7.4. Remark: For future use, we point out that the filtration F = (F(t)) is said
to be quasi-left continuous iff F(T) = F(T-) for each previsible time, T. It can
then be shown that quasi-left continuity is equivalent to accessible times being
previsible.

2.7.5.  In  what  follows  we  will  often  use  the  term  “optional”  time  in  place  of 
“stopping”  time. 

2.7.6. Definition: PT(F) := “family of F-previsible times”; AT(F) := “family
of F-accessible times”; OT(F) := “family of F-optional times”, where F is the
filtration (F(t)).

F will usually not be mentioned and in these cases we will just write PT, AT,
and OT. We now define three sigma algebras of events generated by stochastic
intervals from each of these families. Let K represent any one of the families of
stopping times PT, AT, or OT and set

G(K) := sigma{ [[S,T)) : S ∈ K, T ∈ K }

Remark: Since previsible times are accessible, G(PT) ⊂ G(AT), and since accessible
times are, in particular, stopping times, G(AT) ⊂ G(OT). If we let G(Prog)
denote the sigma-algebra generated by progressive random sets, we have that G(OT)
⊂ G(Prog) by applying Lemma 2.3.11 to 1_[[S,T)). Hence,

G(PT) ⊂ G(AT) ⊂ G(OT) ⊂ G(Prog)

2.7.7. Theorem:

G(K) = sigma{ [[S,T]] : S ∈ K, T ∈ OT },
where K is PT, AT, or OT.

To see this, let S ∈ K and T ∈ OT. Then notice that

[[S,T]] = ∩_{n>0} [[S,T+(1/n)))

and T + (1/n) is previsible, hence accessible and optional. Therefore the
generators, [[S,T]], can be obtained from the defined generators of G(K),
K = PT, AT, OT. Since

[[T]] = ∩_n [[T,T+(1/n)]]

and [[S,T)) = [[S,T]] - [[T]], the reverse is true and the proof is complete.

Remark: This demonstration also proves that [[T]] ∈ G(K), if T ∈ K, K = PT, AT,
OT. So, for example, previsible times have previsible graphs. Not surprising, but
certainly  comforting.  Notice  that  although  it  is  true,  we  have  not  proved  the  con¬ 
verse.  The  only  proof  that  I  am  aware  of  requires  the  so-called  “(Cross)  Section 
Theorem”  from  Capacity  theory.  This  will  be  mentioned  in  the  Chapter  on  Prev¬ 
isible  Projections  (Section  4.4). 


Now  that  we  know  that  optional  and  previsible  random  sets  are  progressive,  the 
next  Theorem  follows  from  Dellacherie's  result  stated  earlier,  saying  that  progres¬ 
sive  random  sets  have  optional  debuts. 


2.7.8.  Theorem: 

If the random set A ∈ G(OT) or G(PT), then the debut of A is a stopping time.

2.7.9. Definition: The process, X, on (Ω,H) is said to be an optional,
accessible, or previsible process according to whether X is, respectively,
G(OT)-measurable, G(AT)-measurable, or G(PT)-measurable.

We  have  already  used  the  following 

2.7.10.  Corollary: 

If X is an optional process and B is a real Borel set then the hitting time of B is a
stopping time.

2.7.11. Remark: Let 0(w) = 0 for all w ∈ Ω; 0 is called the zero stopping time.
As with all boundary cases, it is instructive to satisfy oneself that 0 is a previsible
stopping time. Further, 0_A is previsible, if A ∈ F(0). This is easy to show by
constructing an announcing sequence. For example, let T(n,w) := n 1_B(w), where B
is the complement of A. Then T(n,w) = n on [0_A > 0] = [0_A = oo] = B, and
T(n,w) = 0 on [0_A = 0] = A. Hence, (T(n)) is strictly increasing on [0_A > 0],
approaches oo where 0_A is infinite, and is identically zero where 0_A vanishes.
Finally, each T(n) is a stopping time, since [T(n) ≤ t] = A ∈ F(0) ⊂ F(t) for all t,
0 ≤ t < n, and for t ≥ n, [T(n) ≤ t] = A ∪ B = Ω, which is in every F(t).
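The announcing sequence T(n) = n 1_B for the restricted zero time can be tabulated on a finite sample space. The following Python sketch uses a made-up four-point sample space and a made-up event A, which is simply assumed to lie in F(0); all names are illustrative.

```python
import math

OMEGA = {"a", "b", "c", "d"}   # hypothetical finite sample space
A = {"a", "b"}                 # event assumed to belong to F(0)
B = OMEGA - A                  # complement of A

def zero_A(w):
    """The restriction 0_A: equal to 0 on A and +oo off A."""
    return 0 if w in A else math.inf

def T(n, w):
    """T(n, w) = n * 1_B(w)."""
    return n if w in B else 0

def event_le(n, t):
    """The event [T(n) <= t] as a set of sample points."""
    return {w for w in OMEGA if T(n, w) <= t}
```

For t below n the event [T(n) ≤ t] is A, and for t at or above n it is the whole sample space, matching the two cases in the remark.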

The sigma-algebras G(K), K ∈ {PT,AT,OT}, were defined by varying the type of
stopping time in intervals of the form [[S,T)). This was reduced to just closed
stochastic intervals with only the left end-point determining the type of
measurability. The next result shows that intervals of the form ((S,T]], with S and
T both optional, are sufficient to generate the previsible sigma-algebra, G(PT),
provided that we account for zero stopping times.

2.7.12. Theorem: (Dellacherie (1972, p.67 ff))

G(PT) is generated by [[0_A]], where A ∈ F(0), and by ((S,T]], where S and T are
optional.

It follows immediately that G(PT) is also generated by the random sets,
(s,t] x B, where B ∈ F(s) and s<t are any real numbers, together with {0} x B, where
B ∈ F(0). (We will call the indicator processes of these sets the kernel processes
of G(PT).)

On the other hand, the indicator processes corresponding to (s,t] x B are left
continuous. Hence, G(PT) is contained in the sigma-algebra generated by left
continuous processes. The converse statement and, consequently, part (i) of the
next theorem follow from the fact that left continuous processes can be obtained
as limits of linear combinations of the kernel processes of G(PT):

2.7.13. Theorem.

(i) G(PT) is generated by left continuous, adapted processes.

(ii) G(OT) is generated by Skorokhod, adapted processes.

2.7.14.  The  following  example  is  standard.  Set 

X(t,w) := Z(w) 1_[[S,T))(t,w).

Then

(a) X is optional if Z ∈ F(S), where S,T are optional;

(b) X is accessible if Z ∈ F(S), where S,T are accessible;

(c) X is previsible if Z ∈ F(S-), where S,T are previsible;

(d) Y(t,w) = Z(w) 1_((S,T]](t,w) is previsible if S and T are
optional and Z ∈ F(S).

For (a), first let Z = 1_A, A ∈ F(S). Then Z is F(S)-measurable and Z 1_[[S,T)) =
1_[[S_A,T_A)), which is optional. Thus, the statement holds for indicators, hence for
simple functions, hence for limits of sequences of non-negative simple functions,
etc.

2.7.15.  Remark:  Part  (i)  of  the  last  theorem  might  be  stated  more  explicitly  as 
follows: G(PT) is generated by mappings f from [0,oo) x Ω into R such that each
function t → f(t,w) is left continuous and each function w → f(t,w) is
F(t)-measurable.

2.7.16. Remark: Part (i) of the last theorem guarantees that every left
continuous, adapted process is previsible (hence also, every continuous, adapted
process). However, not every previsible process is left continuous. For example,
if T is a stopping time, then T+1 is previsible so that 1_[[T+1]] is a previsible
process. But this process is not left continuous.


2.7.17.  Remark:  The  “modern”  (post  Dellacherie  [1972])  way  of  defining  G(PT) 
is  to  use  part  (i)  of  the  last  Theorem  as  a  definition,  while  not  assuming  that  the 
filtration, F, is right continuous. Then, having defined previsible events, a
stopping time is called previsible if and only if its graph, [[T]], is a previsible
subset of (0,oo) x Ω. Our definition, in these notes, then holds as a theorem with
(F(t)) replaced by (F(t+)). For such a development see Metivier (1982). Under the
“usual conditions” these two approaches are equivalent.

2.7.18.  Remark:  We  have  not  said  nearly  enough  about  stopping  times,  neither 
their  properties  nor  use  in  studying  processes.  So  we  will  look  with  a  little  more 
detail  at  one  small  sequence  of  results  that  are  important  in  the  sequel. 

First, letting A be a random set, we extend the last definition of debut (Section
2.4.3) by setting D_A^(1) := D_A and defining, for each n ∈ Z+,

D_A^(n)(w) := inf{ t ∈ R+ : [0,t] ∩ A(w) contains at least n elements },

where A(w) is the section of A at w. D_A^(n) is called the n-debut of A. We have
stated earlier that if A is progressive, then D_A, and so D_A^(1), is a stopping time.
Using this fact, we can show by induction on n ∈ Z+ that when A is progressive
then each n-debut is a stopping time. To see this, just observe that we can
write

D_A^(n+1) = D_B, where B := A ∩ ((D_A^(n),oo)).

Given that A is progressive, this equation exhibits D_A^(n+1) as the debut of a
progressive set, if D_A^(n) is a stopping time. Observing that the 1-debut is a
stopping time, and making an induction assumption that the n-debut is a stopping
time, it follows then that the (n+1)-debut is a stopping time. Therefore, by
induction (D_A^(k), k ∈ Z+) is a sequence of stopping times.
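When the section A(w) is a finite set of times, the n-debut is just its n-th smallest point, and the inductive formula can be checked directly. The following Python sketch makes that finiteness assumption; n_debut, n_debut_recursive, and the sample section are hypothetical.

```python
import math

def n_debut(section, n):
    """n-debut of a finite section A(w): its n-th smallest point, or +oo if it has fewer than n."""
    points = sorted(set(section))
    return points[n - 1] if n <= len(points) else math.inf

def n_debut_recursive(section, n):
    """Same quantity, computed as the debut of A(w) intersected with the open interval (D^(k), oo)."""
    d = n_debut(section, 1)
    for _ in range(n - 1):
        d = n_debut([t for t in section if t > d], 1)
    return d

section = [2.5, 0.5, 1.0]   # hypothetical section A(w)
direct = [n_debut(section, n) for n in range(1, 6)]
recursive = [n_debut_recursive(section, n) for n in range(1, 6)]
```

Both lists agree, and they become +oo once n exceeds the number of points in the section.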

The  following  definition  and  Theorem  are  included  here,  not  only  because  they 
will  allow  us  to  “prove”  some  results  in  Chapter  3  and  beyond,  but  also  because 
they give an indication of the spirit in which the use of stopping times gives
intuitive meaning to what could otherwise be a tedious litany of analytic conditions.

2.7.19. Definition: Let X be a Skorokhod, F-adapted process. Then X is said
to charge a stopping time, T, if P(T < oo, X(T) ≠ X(T-)) > 0 and to have a
jump at a stopping time, T, if P(T < oo, X(T) ≠ X(T-)) = 1. Further, a
sequence, (T_n), of stopping times is said to exhaust the jumps of X if

(i) X has a jump at each T_n, n ∈ Z+,

(ii) [[T_i]] ∩ [[T_j]] = ∅, i ≠ j,

(iii) X does not charge any other stopping times.

2.7.20. Remark: Let X be an adapted, Skorokhod process, then X_ is a previsible
process. Set A = {X ≠ X_}. For each n ∈ Z+, let A_n = { |X - X_| > 1/n }.
Then A_n ∈ G(OT) for all n and

A = ∪_n A_n.

Now, since X is Skorokhod the sections, A_n(w), have no cluster points in R+ for
each n and all w ∈ Ω_0, where P(Ω_0) = 1. It follows that each A_n is the union of
the  graphs  of  its  k-debuts  and  so  A  is  contained  in  the  union  of  a  countable 
number  of  graphs  of  stopping  times.  We  need  the  following: 

2.7.21. Lemma (Dellacherie, 1972, IV T17): If A ∈ G(K), where K is either
the class of previsible or accessible or optional times, and A ⊂ ∪_n [[S_n]] for any
sequence of stopping times, (S_n), then there exists a sequence, (T_n), with T_n ∈ K
for each n and

A = ∪_n [[T_n]],

and the graphs of the T_n are pairwise disjoint.

Combining  this  Lemma  and  the  previous  remarks  we  have 

2.7.22.  Theorem:  (Dellacherie,  1972,  IV  T30) 

(i) If X is any adapted, Skorokhod process, then there exists
a sequence of stopping times, (T_n), which exhaust the jumps
of X.

(ii)  If  X  is  previsible  (accessible),  then  the  (Tn)  in  part  (i) 
are  previsible  (accessible). 

Again, let X be an adapted, Skorokhod process, then from part (ii) of this
Theorem we see that if X is accessible, X cannot charge any totally inaccessible
time. The converse of this statement is also true and is a result of the following
observations. Let X be adapted and Skorokhod, and suppose that X does not charge
any totally inaccessible time. Then we know from part (i) of the Theorem that the
sequence (T_n) which exhausts the jumps of X must be accessible. Then
A = ∪_n [[T_n]] is accessible and its complement B := {X = X_} is accessible.
Since X = 1_B X_ + 1_A X and X_ is previsible, it follows that X is accessible.


Therefore,  the  following  Corollary  holds. 


2.7.23.  Corollary. 

Let  X  be  an  adapted,  Skorokhod  process. 

Then X is accessible iff X does not charge any totally
inaccessible time.

Remark:  Earlier  we  introduced  quasi-left  continuous  filtrations.  The  following 
result  leads  to  an  analogous  class  of  processes: 

2.7.24.  Theorem: 

Suppose  that  X  is  adapted  and  Skorokhod.  Then  the  following  statements  are 
equivalent 


(i) The jump times of X are totally inaccessible;

(ii) X does not charge previsible times;

(iii) If the stopping times T_n ↑ T then
lim_n X(T_n) = X(T) on [T<oo], a.s.P.


Remark:  A  process  X  satisfying  any  one  of  these  conditions  is  said  to  be  a 
quasi-left  continuous  process.  Later  in  this  chapter  we  will  point  out  that 
each Skorokhod martingale is quasi-left continuous when the underlying filtration
is  quasi-left  continuous. 


From  the  previous  theorem  on  the  jumps  of  Skorokhod  processes,  we  can  see  that 
if  the  Skorokhod  process  X  is  previsible,  then  X  is  quasi-left  continuous  iff  X  is 
(a.s.P) continuous. A more important result concerning Skorokhod previsible
processes  is  given  by 

2.7.25.  Theorem:  (Dellacherie,  Meyer  [1980]) 

Let X be a Skorokhod process. Then X is previsible iff the following two
conditions hold:

(a) $\Delta X_T = 0$, a.s.P, for all totally inaccessible stopping times T,

(b) For every predictable stopping time T, $X_T$ is F(T-)-measurable on
$[T < \infty]$.

Remark:  Part  (a)  confirms  the  intuitive  fact  that  previsible  processes  cannot 
jump  at  totally  inaccessible  times.  Part  (b)  of  this  theorem  can  be  strengthened 
as  follows: 

If X is previsible, then the random variable $1_{[T<\infty]} X_T$ is F(T-)-measurable
for all stopping times T. This exactly expresses the meaning of previsibility.

Since we are not going to prove the Theorem, it would be helpful to prove the
last remark. Perhaps more importantly, this can be accomplished by a relatively
standard argument based on the Monotone Class Theorem (Appendix A). There
are numerous places in this note where this device should be used, but isn't. So
we will take a moment and at least show the setup. Since, in this case, X is
previsible, we first look at the kernel processes
$X_t(w) = 1_A(w) 1_{(u,\infty)}(t)$, where $A \in F(u)$ and $0 \le u$. Then
$X_T = 1_A 1_{[u<T]}$. Since T is F(T-)-measurable and $A \in F(u)$,
$1_A 1_{[u<T]} 1_{[T<\infty]}$ is F(T-)-measurable; indeed, $A \cap [u<T]$ is a
generator of F(T-). Therefore, when X has this simple form, $1_{[T<\infty]} X_T$
is F(T-)-measurable.

Now let H* be the set of all processes X such that $1_{[T<\infty]} X_T$ is
F(T-)-measurable for all stopping times T. Also, let L be the collection of all
subsets of $(0,\infty) \times \Omega$ of the form $(u,\infty) \times A$, $u \ge 0$,
$A \in F(u)$. Then $1 \in H^*$ and $1_B \in H^*$ when B is in L. Next, it would
have to be shown that if $(X_n)$ is an increasing sequence of nonnegative
functions in H* such that $\sup_n X_n$ is finite, then $\sup_n X_n$ is in H*.

The Monotone Class Theorem then states that H* contains all processes
measurable with respect to $\sigma(L)$ = G(PT), as desired.

As  a  final  remark  about  the  meaning  of  the  result  itself,  recall  that  if  X  is  previsi¬ 
ble,  then  it  is  progressive.  Since  it  is  progressive,  we  know  that  X(T)  is  F(T) 
measurable.  Thus,  we  see  that  the  more  restrictive  assumption  of  previsibility, 
produces  the  sharper  result  that  X(T)  is  F(T-)  measurable  (as  we  would  expect 
from  the  intuitive  meaning  of  previsibility). 

2.8.  Martingales:  This  small  section  contains  a  list  of  some  basic  results  on 
martingales  that  will  be  needed  in  the  remaining  parts  of  this  note. 

As in previous sections, all processes will be considered relative to a probability
space $(\Omega, H, P)$ equipped with a filtration, $F = (F(t), t \ge 0)$. Unless
stated otherwise, we assume that F satisfies the “usual conditions”.


We have already discussed the martingale concept in Chapter 1, and will only
note that if X is some stochastic process, h > 0, and we want to estimate the
increment process, $t \to X(t+h) - X(t)$, on the basis of information that has
accrued up to and including time t, then a reasonable estimator is

$\psi(t) := E(X(t+h) - X(t) \mid F(t))$.

If $\psi = 0$ ($\psi \ge 0$, $\psi \le 0$) then according to the following
definitions, X is a martingale (submartingale, supermartingale).

2.8.1. Definition: An F-martingale, m, is a P-integrable process satisfying

$E(m(t) \mid F(s)) = m(s)$, (a.s.P),

for all $t \ge s \ge 0$.

From the properties of conditional expectation, martingales are F-adapted by
their definition. Supermartingales are P-integrable, F-adapted processes, Y,
such that $Y(s) \ge E(Y(t) \mid F(s))$, a.s.P, for all $t \ge s \ge 0$. Finally,
X is an F-submartingale if -X is a supermartingale. Clearly, a martingale is both
a supermartingale and a submartingale.
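
The definition can be checked by hand in the discrete parameter setting of
Chapter 1. The following Python sketch is only an illustration under our own
assumptions: it takes the fair +/-1 random walk $S_k$ as the example process, and
the function names are hypothetical. It verifies $E(S_{k+1} \mid F_k) = S_k$ on
every atom of $F_k$ using exact rational arithmetic, and it also shows that
$S_k^2$ behaves as a submartingale.

```python
from fractions import Fraction
from itertools import product


def conditional_expectation_next(prefix):
    """E(S_{k+1} | first k steps = prefix) for a fair +/-1 random walk."""
    s = sum(prefix)
    # Average over the two equally likely next steps.
    return Fraction(1, 2) * (s + 1) + Fraction(1, 2) * (s - 1)


def is_martingale(n_steps):
    """Check E(S_{k+1} | F_k) = S_k on every atom of F_k, for k < n_steps."""
    for k in range(n_steps):
        for prefix in product((-1, 1), repeat=k):
            if conditional_expectation_next(prefix) != sum(prefix):
                return False
    return True


def submartingale_gap(n_steps):
    """Smallest value of E(S_{k+1}^2 | F_k) - S_k^2 over all atoms of F_k."""
    gaps = []
    for k in range(n_steps):
        for prefix in product((-1, 1), repeat=k):
            s = sum(prefix)
            cond = Fraction(1, 2) * (s + 1) ** 2 + Fraction(1, 2) * (s - 1) ** 2
            gaps.append(cond - s * s)
    return min(gaps)
```

A nonnegative gap on every atom is exactly the submartingale inequality for
$S_k^2$; for this walk the gap equals 1 on every atom, so $S_k^2 - k$ is again
a martingale.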

2.8.2.  It  is  proved  in  Meyer  [1967]  that 

2.8.3. Lemma: If the filtration F satisfies the “usual conditions”, then an
F-submartingale Y has a Skorokhod (right continuous with left limits) modification
iff the mapping $t \to EY(t)$ is right continuous.

2.8.4. Since we assume the “usual conditions”, such modifications always exist for
martingales. (This follows directly from the definition of martingale since
$Em(t) = Em(0)$, $t \ge 0$. That is, martingales have constant mean value
functions.) Combining Lemmas 2.3.3 and 2.8.3, we can and will always identify a
martingale with its Skorokhod modification. Actually, Meyer proves that if a
submartingale is right continuous then it has finite left limits a.s.P, and states
Lemma 2.8.3 for right continuous submartingales.

However, if X is an F-submartingale which is not right continuous, all that can
be said for it is that for each $t \ge 0$ and a.s.P all paths, right and left
limits exist at t for the restriction of X to any countable dense subset of
$[0,\infty)$. That is, letting Q be the set of nonnegative rationals,

$P(\{w : \lim_{s \to t+,\, s \in Q} X(s), \ \lim_{s \to t-,\, s \in Q} X(s) \text{ exist}\}) = 1$,

for each $t \ge 0$.

Therefore, we can define the process, Y, by setting
$Y(t) := \lim_{s \to t+,\, s \in Q} X(s)$, for each nonnegative t, on a subset C
of $\Omega$ where P(C) = 1, and arbitrarily on $\Omega - C$, so that Y is right
continuous. Further, Y is F-adapted by right continuity of our filtrations:
F(t+) = F(t). To see that Y is also an F-submartingale, let $(h_n)$ be a sequence
of nonnegative real numbers decreasing to zero. Then the sequence $(X(t+h_n))$ is
a “reversed submartingale”, due to the fact that the original process X is a
submartingale, which can be shown to be uniformly integrable. We will spend some
time in a few paragraphs discussing the uniform integrability condition, but for
now it is enough to know that it is sufficient for a.s.P convergence to imply
convergence in $L_1(P)$. Letting $A \in F(s)$ and $s < t$, and applying this
result to

$E\,Y(s)1_A = \lim_{n \to \infty} E\,X(s+h_n)1_A \le \lim_{n \to \infty} E\,X(t+h_n)1_A = E\,Y(t)1_A$,

we obtain $Y(s) \le E(Y(t) \mid F(s))$; Y is an F-submartingale. Y is called the
right continuous modification of X. Thus, under the “usual conditions” a right
continuous modification of X always exists.

In the same manner, one shows that $E\,X(t)1_A \le E\,Y(t)1_A$, for all
$A \in F(t)$, so that $X(t) \le Y(t)$ a.s.P for all t. It follows from this last
statement that X(t) = Y(t) a.s.P, for each t, iff EX(t) = EY(t) for each t. This
is basically the content of Lemma 2.8.3.

2.8.5. Remark: There are a number of results from classical martingale theory
that will be needed in the following chapters. One, Doob's Optional Sampling
(Stopping) Theorem, has already been stated and proved in the discrete parameter
case. The continuous parameter version of this theorem, and others to be
stated later, follows in a relatively simple manner from the discrete version when
the “usual conditions” obtain and the processes are Skorokhod. For brief,
self-contained proofs see N. Ikeda and S. Watanabe [1981]. K. Chung [1974, 1983]
is also an excellent source.

To state these theorems in a form convenient for application in Chapter 6, we
first introduce some terminology for martingales which have finite moments of
order p: M is said to be an $L_p$ martingale iff M is a martingale and
$M \in L_p$, where p belongs to $[1, \infty)$. A related classification, that we
won't use very often until the last chapter, is $L_p$-bounded. A martingale M is
said to be $L_p$-bounded if

$\sup_{t \ge 0} E(|M(t)|^p) < \infty$.

$L_2$-bounded martingales are also called square integrable martingales.

Theorem:  (Doob’s  Inequality) 

Let M be an $L_p$-martingale and $p \in [1, \infty)$. Set
$M_t^* = \sup\{|M_s| : 0 \le s \le t\}$. Then

$\lambda^p P(M_t^* > \lambda) \le \int_{[M_t^* > \lambda]} |M_t|^p \, dP$   (1)

and, if p > 1, then $M_t^* \in L_p$, for all $t \ge 0$, and

$E\{(M_t^*)^p\} \le \left(\frac{p}{p-1}\right)^p E\{|M_t|^p\}$.   (2)


2.8.6. Remark: Clearly, $E|M_t|^p \le E((M_t^*)^p)$. So in terms of the
$L_p(\Omega, H, P)$ norm, inequality (2) says that

$\|M_t\|_p \le \|M_t^*\|_p \le \frac{p}{p-1} \|M_t\|_p$.

Therefore, when p > 1, the mappings $M \to \|M_t\|_p$ and $M \to \|M_t^*\|_p$
define equivalent norms. This remark will be extremely important in Chapter 6,
where the initial analysis will take place with $L_2$-bounded (square integrable)
processes.
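
The two-sided norm comparison can be verified exactly in a small discrete
example. The sketch below is an illustration under our own assumptions, not part
of the theory: it takes the fair +/-1 random walk as the martingale M, enumerates
all $2^n$ paths, and returns the three quantities $E|M_n|^p$, $E(M_n^*)^p$ and
$(p/(p-1))^p E|M_n|^p$. The function name `doob_lp_check` is hypothetical.

```python
from itertools import product


def doob_lp_check(n_steps, p=2.0):
    """Return (E|M_n|^p, E(M_n^*)^p, (p/(p-1))^p E|M_n|^p) for a fair +/-1 walk.

    The expectations are exact averages over all 2**n_steps equally likely paths.
    Requires p > 1.
    """
    total_end = 0.0
    total_max = 0.0
    n_paths = 2 ** n_steps
    for steps in product((-1, 1), repeat=n_steps):
        s = 0
        running_max = 0  # M_0 = 0, so the running sup of |M_k| starts at 0
        for x in steps:
            s += x
            running_max = max(running_max, abs(s))
        total_end += abs(s) ** p
        total_max += running_max ** p
    e_end = total_end / n_paths
    e_max = total_max / n_paths
    bound = (p / (p - 1.0)) ** p * e_end
    return e_end, e_max, bound
```

For p = 2 the first entry is just the variance of the walk, which equals the
number of steps, and the middle entry is squeezed between it and four times it,
as inequality (2) requires.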

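2.8.7. Remark: The inequality (2) is usually called Doob's inequality. Since this
inequality is of great importance to us we will show how it can be deduced from
(1), taken with exponent 1: An application of integration by parts gives

\[
\begin{aligned}
E((M_t^*)^p) &= p \int_0^\infty \lambda^{p-1} P(M_t^* > \lambda) \, d\lambda \\
&\le p \int_0^\infty \lambda^{p-2} \int_{[M_t^* > \lambda]} |M_t| \, dP \, d\lambda \\
&= p \int_0^\infty \lambda^{p-2} \int_\Omega 1_{[M_t^* > \lambda]} |M_t| \, dP \, d\lambda \\
&= p \int_0^\infty \lambda^{p-2} \int_0^\infty P[M_t^* > \lambda, \, |M_t| > \mu] \, d\mu \, d\lambda \\
&= \frac{p}{p-1} \int_0^\infty \Big\{ (p-1) \int_0^\infty \lambda^{p-2} P[M_t^* > \lambda, \, |M_t| > \mu] \, d\lambda \Big\} \, d\mu \\
&= \frac{p}{p-1} \int_0^\infty E\big( 1_{[|M_t| > \mu]} (M_t^*)^{p-1} \big) \, d\mu \\
&= \frac{p}{p-1} E\big( (M_t^*)^{p-1} |M_t| \big) \\
&\le \frac{p}{p-1} \big( E(M_t^*)^p \big)^{(p-1)/p} \big( E|M_t|^p \big)^{1/p}.
\end{aligned}
\]

This last inequality is a consequence of Hölder's inequality. The result follows by
dividing both sides of the last inequality by the first factor on the right and
raising to the power p; if that factor were zero, there would be nothing to prove.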

2.8.8. Remark: Uniform integrability of a family of functions is a classical
concept. (E.g., Meyer [1967], Loeve [1960].) Since it plays a somewhat remarkable
role in the theory of martingales, we are obliged to spend some time discussing
the concept and its application to martingales. The principal use of this material
will be to construct the Stochastic Integral in Chapter 6.

A family, $\Phi$, of P-integrable random variables on $(\Omega, H)$ is said to be
uniformly integrable iff

$\lim_{a \to \infty} \sup \Big\{ \int_{[|X| > a]} |X(w)| \, P(dw) : X \in \Phi \Big\} = 0$.


Let $m = (m(t), t \ge 0)$ be a martingale. We will see that for
$\Phi = \{m(t), t \in R_+\}$, uniform integrability can be characterized by the
requirement that there exists a P-integrable r.v. Z which closes the martingale in
the sense $m(t) = E(Z \mid F(t))$, for all $t \ge 0$. In this case it can be shown
that $Em(t) = EZ$ for all $t \ge 0$,

$\lim_{t \to \infty} m(t) = Z$, a.s.P,

and also in $L_1$. Z is usually denoted by $m(\infty)$, and is called the terminal
random variable of the process m.

We will now indicate how this result and some others are derived with the aid of
uniform integrability. More exactly, we will discuss uniform integrability and its
impact on supermartingale and martingale sequences. The transfer of these
results to the “continuous” parameter case is simple for the processes under
consideration in this note (they are Skorokhod processes).

We will quote two principal sources as we proceed and the interested reader can
refer to these for complete details. However, an attempt will be made to supply
the basic mathematical ideas that yield the results. First of all, Meyer [1967,
p. 17] points out that every finite family of processes is uniformly integrable and
every family majorized by a P-integrable process is uniformly integrable. To
understand his remark about finite families, consider $\Phi = \{h\}$, a family with
only a single $L_1(P)$ random variable. Then

$\int_{[|h| > a]} |h(w)| \, dP(w) \to 0$, as $a \to \infty$,

since $h \in L_1(P)$, $P([|h| > a]) \to 0$ as $a \to \infty$, and the measure
determined by the map $B \to \int_B |h| \, dP$ of H into $R_+$ is absolutely
continuous relative to P.

The case for finite $\Phi$ follows immediately, as does the case where a family is
dominated by a single P-integrable function. These observations are essentially
contained in a characterization given by Meyer [1967, II T19] which states that
uniform integrability is equivalent to the uniform boundedness of $E|f|$ for all
$f \in \Phi$ (i.e., $\sup_{f \in \Phi} E|f| < \infty$) and the “uniform” absolute
continuity of the measures

$B \to \int_B |f| \, dP$, B in H, f in $\Phi$.
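
That uniform boundedness of $E|f|$ alone is not enough can be seen from a
standard example, which we sketch numerically. The setting is our own
assumption: the probability space is [0,1] with Lebesgue measure, the first
family is $f_n = n 1_{[0,1/n]}$ (bounded in $L_1$), and the second is
$g_n = \sqrt{n} 1_{[0,1/n]}$ (bounded in $L_2$). The helper names are
hypothetical, and the supremum is taken over a finite truncation of each family.

```python
import math


def tail_integral(height, width, a):
    """Integral of f over [f > a] for f = height * 1_[0, width] on [0, 1]."""
    return height * width if height > a else 0.0


def sup_tail(family, a):
    """Supremum over the (truncated) family of the tail integrals at level a."""
    return max(tail_integral(h, w, a) for h, w in family)


n_max = 10_000
# f_n = n * 1_[0, 1/n]: E f_n = 1 for every n, so the family is bounded in L_1.
spikes = [(float(n), 1.0 / n) for n in range(1, n_max + 1)]
# g_n = sqrt(n) * 1_[0, 1/n]: E g_n^2 = 1 for every n, so it is bounded in L_2.
tempered = [(math.sqrt(n), 1.0 / n) for n in range(1, n_max + 1)]

levels = [1.0, 10.0, 50.0]
spike_tails = [sup_tail(spikes, a) for a in levels]
tempered_tails = [sup_tail(tempered, a) for a in levels]
```

The tail integrals of the first family stay at 1 for every level a, so that family
is not uniformly integrable, while those of the second family are dominated by
1/a and tend to zero.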

The uniform boundedness condition, $\sup_n E|f_n| < \infty$, implies that

$\sup_n E f_n^- < \infty$ and $\sup_n E f_n^+ < \infty$.   (1)

Because of the monotonicity of their expectations, supermartingales with the first
part of condition (1) are uniformly bounded; the same is true for submartingales
with the second part of condition (1).

It is well known that condition (1) is a sufficient condition for a supermartingale
(submartingale) $(f_n, F_n, n \in Z_+)$ to converge a.s.P to a terminal random
variable, denoted $f_\infty$, with the property that
$E(f_\infty) \le \lim_n E f_n$.

Another consequence of uniform integrability (Meyer [1967, II T21]) is that it
extends Lebesgue's Theorem and tells us that the a.s.P convergence of $f_n$ to
$f_\infty$ also takes place in $L_1(P)$. Therefore,
$E f_\infty = \lim_{n \to \infty} E f_n$.

Thus, if $(f_n, F_n)$ is a uniformly integrable supermartingale, then this
supermartingale converges a.s.P and in $L_1$ to a terminal random variable.
Consequently, $(f_n, F_n, n \in \bar{Z}_+)$, where
$\bar{Z}_+ = Z_+ \cup \{\infty\}$, is also a supermartingale. That is,
under uniform integrability, the time domain of a supermartingale can be
extended to $\bar{Z}_+$ in an obvious manner and the resulting process continues
to be a supermartingale.

In terms of martingales, this says that for every n, we can write
$f_n = E(f_\infty \mid F_n)$. Moreover, the converse of this result is true in the
following sense: If there exists an $L_1$ random variable, U, such that
$f_n = E(U \mid F_n)$, then $(f_n, F_n)$ is a uniformly integrable martingale. That
$(E(U \mid F_n))$ is a martingale is obvious. That it is uniformly integrable
follows from the following Lemma, which is of general interest.

2.8.9.  Lemma 

Let U be an $L_1(P)$ random variable and C be a collection of sub-$\sigma$-algebras
of the $\sigma$-algebra H. Then the family $\{E(U \mid G) : G \in C\}$ is uniformly
integrable.

2.8.10. Remark: This is quite easy to prove. Just use the Chebyshev inequality
for positive random variables and Jensen's inequality to show that

$\sup_{G \in C} P(|E(U \mid G)| > a) \le \frac{1}{a} E|U|$.

We now state three Theorems that are basic to the development carried out in
Chapter 6.

2.8.11.  Theorem:  (Martingale  Convergence) 

Let M be an F, $L_p$-martingale, $p \in [1, \infty)$, and suppose that
$\sup\{E|M_t|^p : 0 \le t < \infty\} < \infty$. Then there exists a random variable
$M_\infty \in L_p$ such that

$\lim_{t \to \infty} M_t = M_\infty$, a.s.P.

Further, if p = 1 and M is uniformly integrable, or simply if p > 1, then $(M_t)$
is an F, $L_p$-martingale, where now $t \in [0, \infty]$, the extended, positive
real line, with $F(\infty) = \sigma(\bigcup_{s \ge 0} F(s))$, and $M_t$ converges
to $M_\infty$ in $L_p$, as $t \to \infty$.

Remark:  See  Chung-Williams  [1983],  or  Meyer  [1967]. 

2.8.12.  Theorem. 

If $m = (m_t, t \ge 0)$ is a Skorokhod supermartingale (submartingale) and
$\sup_t E m_t^- < \infty$ ($\sup_t E m_t^+ < \infty$), then $m_t \to m_\infty$ as
$t \to \infty$, a.s.P, and $m_\infty \in L_1(P)$.

2.8.13. Remark: Recalling condition (1) and remarks, it follows that any
uniformly integrable martingale, m, has a terminal r.v. $m_\infty$:
$m_t \to m_\infty$, a.s.P and in $L_1(P)$, and $m_t = E(m_\infty \mid F(t))$.
Conversely, if $Z \in L_1(P)$, there exists a uniformly integrable martingale, m,
such that $m_t = E(Z \mid F(t))$.

2.8.14. Theorem (Doob’s Optional Sampling Theorem):

Let X be a Skorokhod supermartingale and suppose that there exists a r.v.
$Y \in L_1(P)$ such that $X_t \ge E(Y \mid F(t))$, $t \ge 0$. Let S and T be
F-stopping times with $S \le T$. Then $X_S$ and $X_T$ are P-integrable, and
$X_S \ge E(X_T \mid F(S))$.


Chapter  3.  Increasing  Processes 


3.1. Point Processes: The reader should recall the discussion on discrete point
processes in Chapter 1. Let $(T(n), n \ge 1)$ be a sequence of positive random
variables defined on some filtered probability space, $(\Omega, H, F, P)$. This
sequence is called a point process (PP) if (T(n)) is an increasing sequence on
$\Omega$ with values in $(0, \infty]$ which satisfies $T(n,w) < T(n+1,w)$ for each
natural number n and each w in $\Omega$, if $T(n,w) < \infty$. We immediately
extend the definition by setting T(0,w) = 0 and $T(\infty,w)$ equal to the limit of
the T(n,w) as n approaches infinity, for each w in $\Omega$. Only on one or two
occasions in this note will each T(n) not be a stopping time relative to some
non-trivial filtration. (As random variables, the T(n) are always stopping times
relative to the trivial filtration, which is defined as H for every “time” t.)

3.1.1. The “counting process”, $N = (N(t), t \ge 0)$, associated with a point
process $(T(n), n \ge 1)$, is the stochastic process defined by setting
$N(t,w) := n$, if $T(n,w) \le t < T(n+1,w)$, and $:= \infty$, if
$T(\infty) \le t$. It follows that for $t \ge 0$ and $w \in \Omega$,

$N(t,w) = \sum_{n \ge 1} 1_{[[T(n), \infty))}(t,w)$.

Since N and (T(n)) both contain the same information it is usual to refer to each
as a point process. We will adopt this custom and reserve the name counting
process for those point processes which are non-explosive, in the sense that

$N(t) < \infty$, for all real t, $t \ge 0$.

This condition is equivalent to

$\lim_n T(n) = \infty$.

Notice that the non-explosive condition does not preclude either
$N(\infty) = \infty$ or $N(\infty) < \infty$. In both cases
$\lim_n T(n) = \infty$. If, for example, N only has a single jump, then T(n) is
equal to $\infty$ for $n \ge 2$, by definition of the sequence as a point
process. The jump times of nonexplosive point processes, our counting processes,
do not have finite limit points.
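
The correspondence between the jump times (T(n)) and the counting process N is
easy to make concrete. The following Python sketch is an illustration only: it
assumes a single path given as a finite list of jump times (with infinity
allowed for jumps that never occur), and the function name is hypothetical.

```python
import math
from bisect import bisect_right


def counting_process(jump_times):
    """Return the path t -> N(t) = #{n >= 1 : T(n) <= t}.

    jump_times lists T(1), T(2), ... for one fixed w; entries equal to
    math.inf stand for jumps that never occur.
    """
    finite = [t for t in jump_times if math.isfinite(t)]
    if finite and finite[0] <= 0:
        raise ValueError("jump times must be positive")
    if any(b <= a for a, b in zip(finite, finite[1:])):
        raise ValueError("finite jump times must be strictly increasing")

    def N(t):
        # bisect_right counts the jump times <= t, so N is right continuous.
        return bisect_right(finite, t)

    return N
```

With jump times 0.5, 1.2 and 3.0 the path equals 0 before 0.5, takes the value n
at T(n) itself, and stays at 3 from time 3.0 on, which is the non-explosive case
with $N(\infty) < \infty$.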


Finally, in the Chapter on Dual Previsible Projections, it will be shown that
corresponding to every point process, N, there is a unique (a.s.P), previsible,
increasing process denoted $\tilde{N}$, called the previsible compensator or the
dual previsible projection, such that

$N(t) - \tilde{N}(t)$ is an F-martingale,   (1)

and

$\int_{[T(\infty), \infty]} d\tilde{N}(s) = 0$.   (2)

Jacod (1975) shows that there is a version of $\tilde{N}$ satisfying (2) and having
the property that $\Delta \tilde{N}(t) \le 1$, for each $t \ge 0$.

We note that the point process, N, is also called a simple point process by
virtue of the fact that its jumps are always equal to 1.

When we want to remind the reader of the underlying filtered probability space,
we will write (N,P) or (N,F) for the point process and often refer to the
(P,F)-point process.

Although  we  will  deal  almost  exclusively  with  counting  processes,  most  of  the 
important  results  holding  for  such  processes  carry  over  to  point  processes  and  the 
more  general  class  of  marked  point  processes.  In  order  to  take  marked  point 
processes  into  account  and  also  to  use  these  more  general  processes  to  understand 
the  meaning  (limitations)  of  the  assumptions  characterizing  counting  processes, 
we  will  introduce  marked  point  processes  here  and  give  a  few  examples.  These 
processes will be studied in more detail in Chapter 4 and again at the end of
Chapter 6.

We let $Z = (Z(n), n \ge 1)$ be an arbitrary sequence of random variables defined
on $\Omega$ and taking values in a space E; let $(E, \mathcal{E})$ be a measurable
space. Then, with (T(n)) as above, the double sequence (T(n), Z(n)) is called a
marked point process and E is called the mark space.

If we define the process $N_A = (N_A(t), t \ge 0)$ by

$N_A(w,t) := \sum_{n \ge 1} 1_{[T(n) \le t, \, Z(n) \in A]}(w)$

for A in $\mathcal{E}$, then $N(t,w) := N_E(w,t)$ is the point process introduced
earlier. The mapping

$A \to \mu(w, [0,t] \times A) := N_A(w,t)$

defines a random measure on $[0, \infty) \times E$.
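
For a single path the processes $N_A$ are simple to compute. The sketch below is
a hedged illustration with made-up data: a path is assumed to be a finite list
of pairs (T(n), Z(n)), the mark set A is an ordinary Python set, and the function
names are hypothetical.

```python
def marked_count(points, A, t):
    """N_A(t) = #{n >= 1 : T(n) <= t and Z(n) in A} for pairs (T(n), Z(n))."""
    return sum(1 for time, mark in points if time <= t and mark in A)


def multivariate_count(points, marks, t):
    """The vector (N_{{i}}(t)) over the listed marks i."""
    return tuple(marked_count(points, {i}, t) for i in marks)


# One illustrative path with mark space E = {1, 2, 3}.
points = [(0.4, 2), (1.1, 1), (1.7, 2), (2.5, 3)]
E = {1, 2, 3}
```

Taking A = E recovers the unmarked point process, and for disjoint mark sets the
counts add, which is the finite additivity in A of the random measure.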

This random measure is the primary object of study in the classical (sans
martingales) approach to point processes. This is exemplified in the works of
Kallenberg [1976, 1982] and Matthes, et al. Recently, work has started to appear
(Hoeven) combining martingale and random measure approaches. The first significant
modern work on random measures by the martingale community is Jacod [1975];
we will return to this paper and random measures in Chapter 4.

We  will  end  this  little  digression  with  some  examples  of  marked  point  processes: 

(a)  E  =  {1}.  Then  n(t,E)  is  just  our  original  point  process. 

(b) E = {1,2,3,...,k}, then Z might be the number of messages arriving at a
computer at some random time, T(n).

(c) Use E as in (b). Bremaud defines the multivariate counting process,
$N = (N(t), t \ge 0)$, by setting

$N(t,w,i) = \sum_{n \ge 1} 1_{[[T(n), \infty))}(t,w) \, 1_{[Z(n) = i]}(w)$

and then defining N(t) by $N(t) := (N(t,1), \ldots, N(t,k))$.

Naturally, most univariate counting process results carry over to this
multivariate process, including results on nonlinear filtering. We will not utilize
this below. In applications to stochastic networks of queues it plays a significant
role.

(d) This example is really about counting processes. Just note that when E = {1},
the study of $(T(n), n \ge 1)$ includes the study of renewal processes as a special
case, where the interoccurrence times

$S_{n+1} := T(n+1) - T(n)$

are assumed to be independent and identically distributed. Hence, it includes the
model for life testing, for example. Note that the counting process makes no
assumptions of interoccurrence time independence, and certainly no distributional
assumptions.

(e) It is probably clear that with the proper assumptions the marked point
process (T(n), Z(n)) is also a model for countable state Markov processes! A
characterization of such processes can be given using the notion of “dual
previsible projection”.

Though  by  no  means  exhaustive,  these  examples  should  convince  the  reader  that 
point  processes  can  be  used  in  a  wide  variety  of  applications.  We  will  look  more 
carefully  at  one  particular  application  in  the  sequel. 

3.2. Increasing Processes and Lebesgue-Stieltjes Stochastic Integrals: In
Chapter 6, where the major properties of stochastic integrals with respect to
martingales are developed, we require some elementary facts about one of the
simplest of stochastic integrals, namely those involving integration with respect
to processes whose paths are of bounded variation. This theory alone would be
sufficient for the nonlinear filtering problem if we were able to restrict our
problems to those dynamical systems where the state process was of bounded
variation.

We  will  assume  that  the  reader  recalls  the  definition  of  a  real  valued  function  of 
bounded  variation  defined  on  R.  It  is  sufficient  to  recall  that  every  such  function 
can  be  characterized  as  the  difference  of  two  non-decreasing  functions  (Also  see 
the  Odds  and  Ends  Appendix). 

3.2.1. Definition: An F-adapted, P-integrable, nonnegative process
$A = (A(t), t \ge 0)$ is said to be increasing if the paths, $t \to A(t,w)$, are
increasing and right continuous, a.s.P (satisfying $A(t) < \infty$, a.s.P, for all
$t \in R_+$). Note that “increasing” does not mean “strictly” increasing.

Additionally, A is said to be integrable if $A(\infty) = \lim_{t \to \infty} A(t)$,
which always exists, is P-integrable, that is, if $EA(\infty) < +\infty$. Then
$EA(t) < +\infty$, for all $t \ge 0$.

3.2.2. Remark: Numerous authors talk about increasing processes on the
extended real line, $[0, \infty]$, and not wanting to exclude “jumps” at $\infty$,
write the limit of A(t) as $t \to \infty$ as $A(\infty-)$. The jump at infinity is
then just $A(\infty) - A(\infty-)$. In this case, when A is defined on
$[0, \infty]$, it is said to be integrable if $EA(\infty) < \infty$.

3.2.3. Remark: By definition, an increasing process, A, has increasing, right
continuous paths, a.s.P. In particular then, A is Skorokhod. Further, A is adapted,
so every increasing process is optional relative to (F(t)). When A is an increasing
process relative to the trivial filtration, F(t) = H for all $t \ge 0$, A is said
to be a raw increasing process. When a given discussion involves a nontrivial
filtration, and we want to talk about a raw increasing process, we will often just
say A is an increasing, not necessarily adapted, process. These distinctions become
important in applications as well as in the theory, since the state of a process is
“observable” if the process is adapted to the filtration representing the
observable history.

3.2.4.  Remark:  Denote  by  V+  =  V+(F,P)  the  family  of  equivalence  classes 
(under  indistinguishability)  of  increasing  processes.  Set  BV  :=  V+  -  V+.  Then 
BV  is  called  the  space  of  processes  of  bounded  variation,  or  finite  variation. 
In  particular,  elements  of  BV  have  the  property  that  almost  every  sample  path  of 
each  process  is  of  bounded  variation  on  compact  subsets  of  R+. 

Let IV+ be that subset of V+ consisting of increasing, integrable processes and IV
be the set of differences of members of IV+. IV is then called the space of
processes of integrable variation. $A \in IV$ implies
$E \int_0^\infty |dA(s)| < \infty$.

3.2.5. Let $X = (X(t), t \ge 0)$ be a measurable process, and $A \in V^+$. Then
with each path, $t \to A(t,w)$, we can associate a Lebesgue-Stieltjes integral

$\int_0^t X(s,w) \, dA(s,w) := \int_{(0,t]} X(s,w) \, dA(s,w)$

where, as is the custom, dA(s,w) represents the measure associated with A, for
each w:

$dA((a,b], w) := A(b,w) - A(a,w)$.

Now let X be a measurable process such that
$E(\int_0^\infty |X(s)| \, |dA(s)|)$ is finite. Denote the family of such
processes, X, by $L_1(A)$. Then for $X \in L_1(A)$ the process

$I(t,w) := \int_0^t X(s,w) \, dA(s,w)$

is well defined up to a set of P-measure zero. This is because Fubini's theorem
guarantees that

$w \to \int_0^t X(s,w) \, dA(s,w)$

is H-measurable (See Appendix A for details). Hence, I(t) is a random variable
for each $t \ge 0$. Indistinguishable versions of the process
$I = (I(t), t \ge 0)$ are identified and the resulting equivalence classes are
denoted by $X.A = ((X.A)(t), t \ge 0)$ and called the (Lebesgue-Stieltjes)
stochastic integral of X relative to A. As usual we have suppressed
$w \in \Omega$.

3.2.6. Let $X \in L_1(A)$, with $A \in V^+$. Then, the process
$((X.A)(t), t \ge 0)$ is continuous on the right (continuous, if A is continuous),
and therefore by Lemma 2 of Chapter 2, it is a progressive process. Hence, by an
earlier remark, (X.A)(T) is F(T)-measurable for each stopping time, T.

Also,  since  we  can  write, 


X.A  =  (X+).A  -  (X-).A  , 

X.A  is  the  difference  of  two  increasing  functions,  and  hence,  is  a  function  of 
bounded  variation. 


Further, if X is assumed to be F-progressive, then since A is F-adapted, Fubini's
theorem tells us that (X.A)(t) is F(t)-measurable for each $t \ge 0$; that is, the
process X.A is F-adapted. Hence, in this case X.A is an optional process (since we
have already noted that X.A is a right continuous process).


3.2.7. Remark: From Chapter 2, section 2.7, we know that if A is an increasing
process, and therefore Skorokhod and adapted, there exists a sequence of stopping
times (T(n)) which exhaust the jumps of A and have the same measurability
as A. Set

$A^d(t) = \sum_n (A(T(n)) - A(T(n)-)) \, 1_{[[T(n), \infty))}(t)$.

Then $A^d$ is increasing and $A^c := A - A^d$ is continuous and increasing.
Therefore, $A^c$ is previsible and so if A is previsible, then $A^d$ is also.
Finally, the decomposition $A = A^c + A^d$ is unique, in the usual sense.


It is shown in Dellacherie [1972] that A can be written in the form

$A = A^c + \sum_{n \ge 0} a(n) \, 1_{[[T(n), \infty))}$,

where $a(n) \ge 0$ for all n. $A^c$ is the continuous part of A, and the process
$A^d$ is called the purely discontinuous part of A.

It follows that

$X.A = X.(A^c) + \sum_{n \ge 0} a(n) X(T(n)) \, 1_{[[T(n), \infty))}$

and  so  X.A  is  previsible,  if  A  and  X  are  previsible  or  A  is  continuous  (since,  in 
the  latter  case,  X.A  is  continuous). 

This equation has the obvious consequence that when A is a counting process,
$N = (N(t), t \ge 0)$, where $\Delta N(t) = 1$ or 0 for all t,

$X.N = \sum_{n \ge 1} X(T(n)) \, 1_{[[T(n), \infty))}$.
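
For one fixed path these formulas reduce the Lebesgue-Stieltjes integral to a
finite sum plus an ordinary integral. The sketch below is an illustration under
simplifying assumptions of our own: the continuous part is taken to be linear,
$A^c(s) = \text{drift} \cdot s$, and its contribution is approximated by a
midpoint rule. The function name and parameters are hypothetical.

```python
def ls_integral(X, t, jumps, drift=0.0, n_grid=10_000):
    """(X.A)(t) for the path A(s) = drift * s + sum of a(n) over T(n) <= s.

    X      : integrand, a function of time for the fixed path
    jumps  : list of pairs (T(n), a(n)) with a(n) >= 0
    drift  : slope of the (assumed linear) continuous part of A
    """
    # Purely discontinuous part: sum of a(n) * X(T(n)) over jumps in (0, t].
    jump_part = sum(a * X(T) for T, a in jumps if 0 < T <= t)
    # Continuous part: midpoint-rule approximation of drift * int_0^t X(s) ds.
    h = t / n_grid
    continuous_part = drift * h * sum(X((k + 0.5) * h) for k in range(n_grid))
    return continuous_part + jump_part
```

With drift zero and all a(n) equal to 1, the integrator is a counting process N
and the function returns the sum of X(T(n)) over the jump times up to t, which is
the formula for X.N above.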


3.2.8.  The  following  Theorem  is  well  known  and  easily  proved.  It  was  stated  in 
Chapter  1  for  discrete  parameter  processes  and  will  be  extended,  in  Chapter  6,  to 
stochastic  integrals  with  local  martingale  integrators. 

3.2.9. Theorem:

Let A and B be two Skorokhod processes in BV. Then $AB \in BV$ and

$A(t)B(t) - A(0)B(0) = \int_0^t (A(s) \, dB(s) + B(s-) \, dA(s))$,   (3)

and

$A(t)B(t) = \int_{[0,t]} [A(s) \, dB(s) + B(s-) \, dA(s)]$   (4)

for $t \ge 0$. That is, in (3),

$A(t)B(t) - A(0)B(0) = (A.B)(t) + (B_-.A)(t)$,

where $B_-(t) := B(t-) := \lim_{s \to t-} B(s)$.

3.2.10. Remark: Equation (4) is the correct analogue to the Chapter 1 integration
by parts formula.

3.2.11. Remark: The last equation is often written in the “differential” form
$d(AB) = A \, dB + B_- \, dA$. Of course, this has meaning only through the Theorem.

As in the discrete case, the integration by parts equation can be rewritten in a
more symmetric form,

3.2.12.  Corollary: 

$$d(AB) = A_{-}\,dB + B_{-}\,dA + d[A,B], \qquad (5)$$

where the square bracket, or cross quadratic variation, process is given by

$$[A,B](t) := \sum_{0 < s \le t} \Delta A(s)\, \Delta B(s),$$

where the summation is taken over the countable number of common
discontinuities of the bounded variation processes A and B, and
$\Delta A(t) := A(t) - A(t-)$.

Clearly,  the  equation  for  d(AB)  is  obtained  from  the  Theorem  by  noticing  that 
the  Lebesgue  Stieltjes  integral,  (A  -  A_).B,  is  just  [A,B]. 
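For pure-jump paths both forms of the integration by parts formula reduce to finite sums, so they can be checked by hand or by a short script. The sketch below does this for two hypothetical right-continuous step paths that share one jump time; the paths, the helper names and the time horizon are illustrative assumptions, not part of the text.

```python
def value(start, jumps, t):
    """Right-continuous value at time t: start plus all jumps at times <= t."""
    return start + sum(d for s, d in jumps.items() if s <= t)

def left_limit(start, jumps, t):
    """Left limit at time t: start plus all jumps at times < t."""
    return start + sum(d for s, d in jumps.items() if s < t)

def stieltjes(integrand, jumps, t):
    """Integral of integrand over (0, t] against a pure-jump path."""
    return sum(integrand(s) * d for s, d in jumps.items() if 0 < s <= t)

def bracket(jumps_a, jumps_b, t):
    """[A,B](t): sum of the products of common jumps on (0, t]."""
    return sum(d * jumps_b[s] for s, d in jumps_a.items()
               if s in jumps_b and 0 < s <= t)

# Hypothetical paths: A starts at 1, B starts at 3; both jump at time 1.
A0, JA = 1, {1: 2, 2: -1}
B0, JB = 3, {1: 5, 3: 4}
t = 3

lhs = value(A0, JA, t) * value(B0, JB, t) - A0 * B0
# Equation (3): A dB + B_ dA
rhs_3 = (stieltjes(lambda s: value(A0, JA, s), JB, t)
         + stieltjes(lambda s: left_limit(B0, JB, s), JA, t))
# Equation (5): A_ dB + B_ dA + d[A,B]
rhs_5 = (stieltjes(lambda s: left_limit(A0, JA, s), JB, t)
         + stieltjes(lambda s: left_limit(B0, JB, s), JA, t)
         + bracket(JA, JB, t))

print(lhs, rhs_3, rhs_5)
```

The only difference between the two right-hand sides is where the common jump at time 1 is accounted for: inside the integral A.B in (3), and inside the bracket [A,B] in (5).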

The importance of the representation for d(AB) above will be recognized when it
is demonstrated that the natural integrands for the stochastic integrals defined
below are previsible processes (and, for instance, $A_{-}$ is previsible).

3.2.13. Remark: Recall that any martingale can be taken to be Skorokhod. If
our filtration $F = (F(t), t \ge 0)$ is quasi-left continuous, then any
F-martingale of BV is continuous at previsible times:


M(T)  =  E(  M(T)  |  F(T)  )  =  E(  M(T)  |  F(T-)  )  =  M(T-). 


Since  under  quasi  left  continuity,  accessible  times  are  previsible,  we  have  that 
(BV)  martingales  can  jump  only  at  totally  inaccessible  times.  Therefore,  under 
this  condition  on  the  filtration,  integration  with  respect  to  martingales  of 
bounded  variation  is  even  permissible  in  the  Riemann-Stieltjes  sense  when  the 
integrands are continuous at totally inaccessible times. This is the case, for
example, when the integrand jumps only at previsible times, as for Skorokhod
previsible processes (Jacod, 1979).

3.2.14. Remark: Liptser and Shiryayev [1978, Vol. 2, p. 261] give a very
informative example. Considering the LS integral of a Poisson process relative to
the centered Poisson process, they demonstrate that (N.M)(t) is not a martingale,
but that $(N_{-}.M)(t)$ is one, where $N_{-}(t) = N(t-)$ and $M(t) = N(t) - ct$,
$t \ge 0$, c the Poisson parameter of N. Notice that $N_{-}(t)$ is previsible,
because it is left continuous.
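A quick numerical sanity check of this example is possible because a martingale started at zero must have zero mean at every time. The sketch below computes the two means exactly (up to series truncation), using two standard facts that are assumptions here rather than statements from the text: given N(t) = n, the jump times are distributed as ordered uniforms on [0,t], so the conditional mean of the time integral of N over [0,t] is nt/2; and the sums of N and of N_ over the n jumps are n(n+1)/2 and n(n-1)/2. It tests only the mean, which is a necessary condition for the martingale property and not a proof of it.

```python
import math

def poisson_expectation(g, mean, n_max=200):
    """E g(N) for N ~ Poisson(mean), computed from a truncated series."""
    p = math.exp(-mean)          # P(N = 0)
    total = 0.0
    for n in range(n_max + 1):
        total += p * g(n)
        p *= mean / (n + 1)      # P(N = n + 1) from P(N = n)
    return total

def mean_N_dot_M(c, t):
    """E (N.M)(t), with M(t) = N(t) - c t and integrand N (right continuous)."""
    return poisson_expectation(lambda n: n * (n + 1) / 2 - c * n * t / 2, c * t)

def mean_Nminus_dot_M(c, t):
    """E (N_.M)(t), with the previsible integrand N_(s) = N(s-)."""
    return poisson_expectation(lambda n: n * (n - 1) / 2 - c * n * t / 2, c * t)

c, t = 2.0, 1.5                  # hypothetical parameter and time
print(mean_N_dot_M(c, t), mean_Nminus_dot_M(c, t))
```

Under these assumptions the first mean equals ct, which is not zero, while the second vanishes, consistent with the claim that only the previsible integrand produces a martingale.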

3.2.15. Remark: The following result is proved by Doleans and Meyer (1970, p. 89).


3.2.16. Theorem:

If X is an F-previsible process, M is an F-adapted martingale which belongs to IV
and $X \in L^1(M)$, then (X.M)(t), $t \ge 0$, is an F-martingale.


We  have  seen  the  analogous  result  for  martingale  transforms  in  Chapter  1.  More 
such  results  will  be  seen  in  Chapter  6  as  stochastic  integrals  are  extended  to 
wider  and  wider  classes  of  integrators.  Moreover,  these  stochastic  integrals  will 
agree with the Lebesgue Stieltjes stochastic integral when the “integrator”
martingale is taken to be a member of IV.


3.2.17. Remark: It is easy to show that M is an F-martingale iff E(X.M)(t) = 0
for all F-previsible kernels $X = 1_{B \times (s,t]}$ and $B \in F(s)$. Bremaud
[1981] points this out and observes that this is just one of the many reasons why
previsible processes play a central role in the theory of stochastic integration.
As we proceed we will meet numerous other instances to support this position.

Chapter 4. Dual Previsible and Previsible Projections

4.1. Introduction: As observed earlier in these notes, measurable processes are
not necessarily adapted. If X is such a process (i.e., measurable, not adapted)
and the filtration F = (F(t)) is interpreted as the history of observation
connected with some experiment, then path segments of X,
$X_{[0,t]} := (X(s), 0 \le s \le t)$, are not observable outcomes of this
experiment for every time t. Therefore, if it is appropriate to derive
information about the evolution of X = (X(t)) over a time interval [0,t] from
this experiment, this information must be estimated with the aid of the history
of outcomes of the experiment. One such estimate is
$\hat{X}(t,\omega) := E(X(t) | F(t))(\omega)$, a.s.P. If one intends to use
$\hat{X}$ to estimate the path segments $X_{[0,t]}$, however, then one is faced
with the seemingly impossible task of pasting together an uncountable number of
the versions of $\hat{X}(s)$ to obtain $\hat{X}(s,\omega)$ for all $s \le t$ and
all $\omega$ in some set K with the property that P(K) = 1. The results of this
section show that this can be accomplished uniquely, provided that the
estimating process is carried out at optional or previsible times.

In order to look at the results of this section from another direction, suppose
that the process X is adapted to (F(t)). Then it is well known that X is
determined by (F(t)) through E{ X(t) | F(t) } = X(t), a.s.P
($\hat{X}(t) = X(t)$ a.s.P). That is, when X is F(t)-adapted, X(t) is determined
by the integrals $E\{ X(t) 1_A \}$, for all $A \in F(t)$. Results of the first
part of this section show that previsible processes X are uniquely determined by
the P-integrals of $X_T$ on $[T < \infty]$, where T is previsible.

These  two  observations  concern  “previsible  (optional)  projections”  of  a  stochastic 
process. The majority of results of this section concern the “dual previsible
projection” of a process. This projection concerns increasing processes and it plays a
fundamental  role  in  the  calculus  of  martingales.  The  dual  previsible  projection 
will  be  defined  in  terms  of  previsible  projections  and  “admissible”  measures,  the 
latter  coming  up  next. 

We  will  assume  throughout  this  section  that  the  underlying  filtration 
F  =  (F(t),t>0)  satisfies  the  “usual  conditions”. 

4.2. Measures Generated by Increasing Processes: Let $\mu$ be a measure on
a sub σ-algebra G of $B([0,\infty)) \times H$, where $(\Omega, H, F, P)$ is the
underlying filtered probability space with
$H = \sigma( \bigcup_{t \ge 0} F(t) ) := F(\infty)$.

We will follow Metivier and call $\mu$ admissible if $\mu(B) = 0$ for every
evanescent $B \in G$. This is similar to absolute continuity of measures: Let
$\Pi(B)$ be the projection of B onto $\Omega$. Then $\mu$ is admissible if
$P(\Pi(B)) = 0$ implies $\mu(B) = 0$.

As an example of such a measure, let A be an increasing, integrable
($A(\infty) \in L^1(P)$) process and set

$$\mu(C) = E\{ \int_{(0,\infty)} 1_C(s,\omega)\, dA(s,\omega) \}, \qquad (1)$$

where $C \in G = B(R_+) \times H$. This measure, $\mu$, is admissible and bounded
on $B([0,\infty)) \times H$.

With A as above, define $\mu_A$ by setting

$$\mu_A(X) = E\{ \int_{(0,\infty)} X(s)\, dA(s) \} \qquad (2)$$

on the space of bounded measurable processes X. Then $\mu_A$ is a linear
functional on this space. Observe that $\mu_A(1_C) = \mu(C)$.

It is easy to show (e.g., first use simple processes, then pass to the limit) that

$$\mu_A(X) = \int_{R_+ \times \Omega} X\, d\mu \qquad (3)$$

for X measurable and bounded (or X positive) and $\mu$ as in (1). In a common
abuse of the language, both $\mu$ and $\mu_A$ are often referred to as measures,
$\mu_A$ as the measure generated by A.

4.2.1.  Remark:  Later  in  this  Chapter,  measures  generated  by  increasing 
processes  will  be  characterized  and  used  to  introduce  and  study  the  notion  of 
“dual  previsible  projection”.  Prior  to  this  development,  such  measures  together 
with  previsible  projections  will  be  used  to  state  a  criterion  for  the  previsibility  of 
raw  increasing  processes. 

4.2.2. Now take $\mu$ to be as defined in (1) and let T be an F-optional time.
Set $(s,\omega) \mapsto A(s,\omega) := 1_{[[T,\infty))}(s,\omega)$. Then A jumps
at [[T]], the graph of T, and is equal to 1 to the “right” of [[T]], that is, on
$[[T,\infty))$.

Therefore,

$$\int_{[0,\infty)} 1_C(s,\omega)\, dA(s,\omega) = 1_C(T(\omega),\omega)\, 1_{[T<\infty]}(\omega).$$

It follows that

$$\mu(C) = \int_\Omega 1_C(T(\omega),\omega)\, 1_{[T<\infty]}(\omega)\, P(d\omega) = E\{ 1_C(T)\, 1_{[T<\infty]} \}$$

and so, for any bounded measurable process X,

$$\int_{R_+ \times \Omega} X\, d\mu = E\{ X_T\, 1_{[T<\infty]} \}, \qquad (4)$$

where we have written $X_T$ in place of X(T). This is most easily seen by first
taking X to be a simple process and then passing to the limit. For example, let
$X := \sum_k a_k 1_{C_k}$, where $X(s,\omega) = a_k$ on $C_k$ and $(C_k)$ is a
finite partition of $R_+ \times \Omega$ into sets of $B(R_+) \times H$. Then

$$\int X\, d\mu = \int \sum_k a_k 1_{C_k}\, d\mu = \sum_k a_k\, \mu(C_k) = E\{ \sum_k a_k 1_{C_k}(T)\, 1_{[T<\infty]} \} = E\{ X_T\, 1_{[T<\infty]} \}.$$

With $A = 1_{[[T,\infty))}$ as in the beginning of this paragraph, denote the
admissible measure $\mu$ by $\mu_T$.


The  following  Theorem  establishes  a  mapping  of  bounded  measurable  processes 
into  previsible  processes.  This  mapping  behaves  much  as  a  conditional  expecta¬ 
tion  operator. 


4.2.3. Theorem: (Metivier, 1982)

For every bounded measurable process, X, there exists a unique, previsible
process, ${}^{p}X$, such that

$$\int_U X\, d\mu_T = \int_U {}^{p}X\, d\mu_T, \qquad (5)$$

where $U = R_+ \times \Omega$, for every previsible time, T.

4.3. Previsible Projections: The process, ${}^{p}X$, is called the previsible
projection of X onto G(PT), the F-σ-algebra of previsible events. By equation
(4), the defining equation (5) is equivalent to the requirement that

$$E\{ X_T\, 1_{[T<\infty]} \} = E\{ {}^{p}X_T\, 1_{[T<\infty]} \} \qquad (6)$$

for every previsible stopping time, T. This equation in turn is equivalent to

$$E\{ X_T\, 1_{[T<\infty]} \mid F(T-) \} = {}^{p}X_T\, 1_{[T<\infty]}, \qquad (7)$$

a.s.P.

4.3.1. Remark: The proof of this last statement is quite easy. The trick is to
replace the previsible time, T, by its restriction, $T_C$, to any set C in F(T-).
By the last result in Section 2.6.5, $T_C$ is previsible and the previous
equation for the previsible projection applies. Since $T_C$ takes the value
$\infty$ off the set C, a moment’s thought gives this equation in the form

$$E\{ X_T\, 1_{[T<\infty]}\, 1_C \} = E\{ {}^{p}X_T\, 1_{[T<\infty]}\, 1_C \}.$$

Then using the fact that ${}^{p}X_T\, 1_{[T<\infty]}$ is F(T-)-measurable, the
result follows from the definition of conditional expectation and the
arbitrariness of C in F(T-).

4.4. Section Theorems: The proof of the Metivier Theorem itself, however,
relies on one of the deeper parts of the general theory of stochastic processes,
namely, the so-called Section Theorems. These are the result of applying the
Theory of Choquet Capacity and Analytic Sets to measure theory. Of course this
theory will not be discussed here, but to establish some frame of reference for
the projection theory of this section, we will state one of the Section Theorems
and two results that follow from this theorem. To some extent, this will further
justify some of the grandiose claims about stopping times made in the
introduction to Chapter 2.

4.4.1. Theorem (Section Theorem):

If $(\Omega, H, (F(t), t \ge 0), P)$ is a filtered probability space and the
random set A is optional, then given any $\epsilon > 0$, there exists an optional
time, T, such that

(a) [[T]] is contained in A, and

(b) $\epsilon + P\{ \omega : T(\omega) < \infty \} > P\{ \Pi(A) \}$,

where $A \mapsto \Pi(A)$ is the projection map of $R_+ \times \Omega$ onto
$\Omega$. Further, if A is previsible, then T can be taken to be previsible.

The following is immediate:

4.4.2. Corollary (Dellacherie, 1972):

Let X and Y be optional (previsible) processes. Then X and Y are
indistinguishable iff X(T) = Y(T), a.s.P, for any optional (previsible) time T.

The proof given by Dellacherie will be paraphrased here because it is simple and
indicates why the Section Theorems are important: Let
$A = \{ (t,\omega) : X(t,\omega) \ne Y(t,\omega) \}$. Assume that the optional
processes X and Y are not indistinguishable, so that A is not evanescent, that
is, $P(\Pi(A)) > 0$. Then, by (a) and (b) of the theorem with
$\epsilon < P(\Pi(A))$, there exists an optional time T whose graph is contained
in A and for which $P\{ T < \infty \} > 0$. Hence,
$X(T(\omega),\omega) \ne Y(T(\omega),\omega)$ on an event with positive
probability. That is, X(T) = Y(T), a.s.P, for all optional times T implies that
X and Y are indistinguishable. Conversely, if X and Y are indistinguishable,
then $P(\Pi(A)) = 0$ so that X(T) = Y(T), a.s.P, for all optional times T.

A second application proves the uniqueness statement in Metivier's theorem on
the existence of previsible projections:

4.4.3. Corollary.

Let X and Y be previsible bounded (or positive) processes. If for each previsible
time T one has

$$E\{ X_T\, 1_{[T<\infty]} \} = E\{ Y_T\, 1_{[T<\infty]} \}, \qquad (8)$$

then the processes X and Y are indistinguishable.

The proof is similar to that of the last Corollary.

Once the uniqueness of the previsible projection is shown, a monotone class
argument centering on processes of the form $X := 1_{F \times (s,t]}$, with F in
F(s) for s < t, is used to show the existence of previsible projections.
Previsible projections for such processes will be given below in the Examples
subsection.

On the way to proving the existence of previsible projections, Metivier proves
that

$${}^{p}(ZX) = Z\, {}^{p}X$$

for all bounded previsible processes Z. This gives another important property of
previsible projections and one which again suggests that they behave like
conditional expectations.

Letting X be a bounded measurable process, we briefly note several properties
of previsible projections:

(a) If X is a previsible process, then ${}^{p}X = X$;

(b) The mapping $X \mapsto {}^{p}X$ is linear;

(c) If $(X_n)$ is an increasing sequence of bounded measurable processes, then
the previsible projection of the supremum of the sequence is the supremum of the
projections;

(d) If X is left continuous, then its previsible projection is left continuous.


4.5. Optional Projections: The optional projection, ${}^{o}X$, of a bounded
measurable process X also exists, is unique, and satisfies (see equation (7))

$$E\{ X_T\, 1_{[T<\infty]} \mid F(T) \} = {}^{o}X_T\, 1_{[T<\infty]}, \qquad (9)$$

a.s.P, for every optional time T. (See Dellacherie, 1972 and Dellacherie and
Meyer, 1981.) Thus, as with the previsible projection, the optional projection
can be written as a conditional expectation, but with the conditioning algebra
equal to F(T), rather than F(T-).

The properties listed above for the previsible projections have obvious
counterparts in the optional case. The following properties also hold:

(e) ${}^{p}X = {}^{p}({}^{o}X)$.

(f) $X \le Y$, a.s.P, implies ${}^{o}X \le {}^{o}Y$, ${}^{p}X \le {}^{p}Y$.

The last property says that optional and previsible projections are order
preserving. The next property says that optional and previsible projections are
not very different.

(g) The sections, $B_\omega$, of the random set
$B = \{ {}^{o}X \ne {}^{p}X \}$ are countable for all $\omega \in \Omega$. This
means that on any path of a process X, its optional and previsible projections
differ at only a countable number of time points. In general, random sets, C,
subsets of $R_+ \times \Omega$, whose sections,
$C_\omega = \{ t \ge 0 : (t,\omega) \in C \}$, are countable for each $\omega$
are said to be thin, or “mince” in the French literature. Therefore,
$B = \{ {}^{o}X \ne {}^{p}X \}$ is a thin random set. For the case of the Poisson
process, see example (3) immediately following.

(h) An earlier remark, characterizing the previsibility of increasing processes,
has the following analogue under optionality: A is an optional increasing process
iff for all bounded measurable processes X, $\mu_A(X) = \mu_A({}^{o}X)$.

4.5.1. Examples:

(1) X of the form $X = Z\, 1_{((r,s]]}$, where r < s are positive real numbers
and Z is a bounded measurable function.

Optional case: Set Y(t) = E{ Z | F(t) }. Y can be and is chosen to be a right
continuous modification having left limits. Since Y is adapted it is then
optional. Therefore,

$$E\{ X_T\, 1_{[T<\infty]} \mid F(T) \} = E\{ 1_{((r,s]]}(T)\, Z \mid F(T) \} = 1_{((r,s]]}(T)\, Y(T),$$

for any optional time T, where Y(T) = E(Z | F(T)) by Doob’s Optional Stopping
Theorem. Hence, ${}^{o}X = 1_{((r,s]]}\, Y$.

Previsible case: Take Y as before and let T be a previsible time. Let (T(n)) be
a sequence of optional times announcing T. Then Y(T(n)) = E{ Z | F(T(n)) } and

$$Y(T-) = \lim_{n \to \infty} Y(T(n)) = E\{ Z \mid \sigma( \bigcup_n F(T(n)) ) \}$$

a.s.P (V.T8, Dellacherie, 1972). But $F(T-) = \sigma( \bigcup_n F(T(n)) )$
(III, T39.b, Ibid). Therefore, Y(T-) = E{ Z | F(T-) }, for previsible T. Finally,
since T and Y(T-) are F(T-)-measurable, it follows as in the last case that
${}^{p}X = 1_{((r,s]]}\, Y_{-}$.

(2) Let S be a totally inaccessible time and set $X = 1_{[[S]]}$. Then
$X(T(\omega),\omega) = 1_{[[S]]}(T(\omega),\omega) = 1_{[T=S]}(\omega)$ a.s.P,
for any previsible time, T. Therefore,
$E( X_T\, 1_{[T<\infty]} ) = E\, 1_{[T=S<\infty]}$. The latter quantity equals
zero, by definition of total inaccessibility. Hence, by the first Corollary to
the Section Theorem, we then have that ${}^{p}X$ is evanescent. Setting all the
details aside, this should be intuitively clear from the definition of total
inaccessibility and any reasonable interpretation of projection.

(3) Let X be a Poisson process with parameter c > 0, so that X is optional (it is
right continuous) and, consequently, ${}^{o}X = X$. Then ${}^{p}X = X_{-}$, where
$X_{-}(t) = X(t-)$, $t \ge 0$. This example can be used to illustrate the idea of
thin random sets defined in the previous section. Notice that the sections
$B_\omega$ of $B = \{ {}^{o}X \ne {}^{p}X \}$ are just
$B_\omega = \{ T_n(\omega) : n > 0 \}$, where $(T_n)$ is the sequence of jump
times of the process X.
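The thin set in example (3) can be seen along a single path: the right-continuous path and its left-limit version disagree only at the jump times. The following sketch uses a hypothetical list of jump times and a finite time grid; it illustrates the section $B_\omega$ for one fixed path and makes no claim about the projections themselves.

```python
def path(jump_times, t):
    """X(t): number of jumps at times <= t (right continuous)."""
    return sum(1 for s in jump_times if s <= t)

def path_left(jump_times, t):
    """X(t-): number of jumps at times < t (left continuous)."""
    return sum(1 for s in jump_times if s < t)

def difference_set(jump_times, grid):
    """Grid points at which X and X_ differ."""
    return [t for t in grid if path(jump_times, t) != path_left(jump_times, t)]

jump_times = [0.5, 1.25, 3.0]          # hypothetical jump times T_1, T_2, T_3
grid = [k / 4 for k in range(17)]      # 0, 0.25, ..., 4.0

print(difference_set(jump_times, grid))
```

On this grid the difference set is exactly the list of jump times, a finite and hence countable set.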

4.6. Dual Previsible Projections: Consider the measure, $\mu_A$, defined earlier
in this Chapter by setting $\mu_A(X) = E\{ (X.A)(\infty) \}$ for all bounded (or
positive) measurable functions X, where A was an increasing process. As noted,
$\mu_A$ is called the measure “generated” by the process A. Let X be any positive
measurable process, define another measure
$m(X) := E\{ ({}^{p}X.A)(\infty) \}$ and ask if there is a nondecreasing process
$\hat{A}$ with the property that $m(X) = \mu_{\hat{A}}(X)$. This question is the
same as interpreting $\mu_A(X)$ as an ordered scalar product
$\langle X, A \rangle$ and asking about the dual, $\hat{A}$, of ${}^{p}X$, in the
sense that $\langle {}^{p}X, A \rangle = \langle X, \hat{A} \rangle$.

We will drop the subscript A on $\mu_A$ for a while, but retain the above
definition. Dellacherie shows that if $\mu^p$ is defined by setting
$\mu^p(X) := \mu({}^{p}X)$ for every positive measurable process X, then $\mu^p$
is a σ-finite measure on $B(R_+) \times F(\infty)$. The corresponding unique (up
to indistinguishability) increasing process generating this measure is denoted by
$A^p$ and called the dual previsible projection of the process A. The measure
$\mu^p$ is referred to as the dual previsible projection of the measure $\mu$.
(E.g., Metivier, and Hoeven.)

4.6.1. Remark: Assume the usual conditions on the underlying filtered probability
space $(\Omega, H, (F(t)), P)$. Dellacherie [1972, IV T41, T42] gives the
following characterization of measures generated by increasing processes:

4.6.2. Theorem:

A σ-finite measure $\mu$ on $(R_+ \times \Omega, B(R_+) \times H)$ is generated
by an integrable, increasing (not necessarily adapted) process, A, iff

(a) $\mu([[0]]) = 0$ and $\mu([[0,t]]) < \infty$, $t \in R_+$,

(b) $\mu$ is P-admissible.

Then A is unique up to P-indistinguishability.

Further, A is adapted iff

(c) $\mu([0,t] \times B) = \mu( E(1_B \mid F_t)\, 1_{[[0,t]]} )$
for all $t \in R_+$ and $B \in H$.

4.6.3.  Remarks:  Recall  that  to  avoid  some  complications  in  exposition  we  have 
assumed  as  part  of  the  definition  of  increasing  process  in  Chapter  3  that  A(0)=0. 
This  is  Dellacherie’s  assumption  also,  but  Jacod  [1979]  does  not  make  this 
assumption  here,  nor  do  Dellacherie  and  Meyer  [1980].  These  latter  works  also 
do not assume that $\mu([[0]]) = 0$, but that the measure has finite mass. As
defined in the beginning of this Chapter, condition (b) just says that $\mu$
assigns zero measure to evanescent random sets.

4.6.4.  Remarks:  The  reader  will  notice  that  as  we  come  to  the  end  of  this  note, 
more  proofs  will  be  given.  This  is  especially  true  in  Chapter  6.  For  a  number  of 
reasons,  we  choose  to  prove  the  present  Theorem: 

Suppose then that $\mu$ is generated by an increasing process as specified in the
statement of the theorem. We first note that the finiteness of $\mu$ in (a)
follows from the integrability of A. The admissibility of $\mu$ is obvious, since
P(B) = 0 implies that $I \times B$, where I is an interval, is evanescent.
Therefore, all that remains of the necessity portion of the proof is condition
(c). We observe first that A is F-adapted iff for all $B \in H$

$$E\, 1_B A_t = E( E(1_B A_t \mid F_t) ) = E( E(1_B \mid F_t)\, A_t )$$

for all $t \in R_+$. That is, iff $A_t$ is orthogonal to all r.v.s of the form
$1_B - E(1_B \mid F_t)$. But the last equation is just condition (c), since when
$\mu = \mu_A$ is generated by A and A is adapted

$$\mu_A([0,t] \times B) = E( \int_0^\infty 1_{[[0,t]]}(s,\cdot)\, 1_B\, dA_s ) = E(1_B A_t) = E( E(1_B \mid F_t)\, A_t )$$

$$= E( \int_0^\infty E(1_B \mid F_t)\, 1_{[[0,t]]}\, dA_s ) = \mu_A( E(1_B \mid F_t)\, 1_{[[0,t]]} ).$$


Conversely, if the three conditions are satisfied, then for all $B \in H$ define

$$Q_t(B) := \mu([0,t] \times B).$$

Then $Q_0(B) = 0$ for all B and $Q_t$ is a bounded measure for all t > 0.
Admissibility of $\mu$ shows that $Q_t$ is absolutely continuous with respect to
P on $(\Omega, H)$. Let A' be defined by setting $A'_t = dQ_t / dP$, a.s.P, the
Radon-Nikodym derivative of $Q_t$ relative to P. Then $A'_0 = 0$, a.s.P, and
$A'_t$ is P-integrable. Since $\mu$ is a positive measure, $A'_s \le A'_t$,
a.s.P, if s < t.

By Lebesgue’s Monotone Convergence Theorem,
$A'_t = \lim_{n \to \infty} A'_{t_n}$ in $L^1(P)$, where $(t_n)$ is any sequence
decreasing to t. It follows that the convergence is also almost sure (P). Hence,
we can define the process A as a right continuous, increasing modification of the
process A' by setting $A_t := \inf\{ A'_r : r \text{ rational and } r > t \}$.


With this A, the measure generated by A satisfies

$$\mu_A([0,t] \times B) = E \int_0^t 1_B\, dA_s = E\, 1_B A_t.$$

But since A is a modification of A', $A_t$ is, a.s.P, the Radon-Nikodym
derivative of $Q_t$ relative to P. Hence,

$$E\, 1_B A_t = Q_t(B) = \mu([0,t] \times B).$$

Therefore,

$$\mu_A([0,t] \times B) = \mu([0,t] \times B)$$

for all $t \in R_+$ and $B \in H$. It follows that $\mu_A = \mu$, since the sets
of the form $[0,t] \times B$ are generators for the product σ-algebra,
$B(R_+) \times H$.

Finally, we have already verified that A is adapted by condition (c). Uniqueness
follows by noting that if G is another increasing process generating $\mu$, then
$G_t = A_t$, a.s.P, for each t, and G is a modification of A. Since G and A are
right continuous, Lemma 2.3.3 guarantees that they are indistinguishable.

4.6.5. Remark: Let X be any positive, measurable process and set

$$\mu(X) = E( ({}^{p}X.A)_\infty ),$$

where A is a, not necessarily adapted, integrable, increasing process. Because of
the properties of linearity, monotonicity and continuity of previsible
projections, $\mu$ is a σ-finite measure on $B(R_+) \times H$. Therefore, by the
last theorem, there exists a unique increasing process, denoted $A^p$, which
generates $\mu$. Hence, from the last equation

$$E\, (X.A^p)_\infty = E( ({}^{p}X.A)_\infty ).$$

We  need  the  following  lemma  to  conclude  that  the  process  Ap  is  previsible: 

4.6.6. Lemma: (Dellacherie [1972, V T26]) An integrable, increasing process,
A, is previsible iff for any two positive, measurable processes X and Y with the
same previsible projection, $\mu_A(X) = \mu_A(Y)$.

But if X and Y have the same previsible projections then from the preceding
construction

$$E\, (X.A^p)_\infty = E( ({}^{p}X.A)_\infty ) = E( ({}^{p}Y.A)_\infty ) = E\, (Y.A^p)_\infty.$$

Hence, the Lemma shows that $A^p$ is a previsible process. Therefore, we have the
following

4.6.7. Theorem: (Dellacherie, 1972, p. 107)

Let A be an integrable, increasing, not necessarily adapted, process with
A(0) = 0. Then there exists a unique previsible, increasing process, $A^p$,
called the dual previsible projection of A, such that, for each positive
measurable process X,

$$E\{ \int_0^\infty {}^{p}X_s\, dA_s \} = E\{ \int_0^\infty X_s\, dA^p_s \}. \qquad (10)$$


4.6.8.  Remark:  Bremaud  (1981)  and  Meyer  (1973)  state  this  result  in  a  slightly 
different  form  which  will  be  useful  later  on: 

Let A be an integrable, increasing process with A(0) = 0. Then there exists a
unique (up to indistinguishability) integrable, previsible, increasing process,
$A^p$, such that $A^p(0) = 0$ a.s.P and

$$E\{ \int_0^\infty C(s)\, dA^p(s) \} = E\{ \int_0^\infty C(s)\, dA(s) \}$$

for all non-negative, previsible processes, $(C(s), s \ge 0)$.

As  indicated  in  this  result  (with  C  =  1),  the  duals  of  integrable  processes  are 
themselves  integrable.  However,  it  may  be  shown  that  the  dual  projections  of 
increasing  bounded  processes  are  not  necessarily  bounded. 
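A standard illustration of the last point, not taken from the text, is the single-jump process $A = 1_{[[T,\infty))}$ with T exponentially distributed with rate c, considered in the filtration generated by A itself. Under that assumption its dual previsible projection is commonly given as $A^p(t) = c \min(t, T)$, which is unbounded although A is bounded by 1. The sketch below only checks the defining identity with C = 1 on [0,t], namely that both processes have the same mean, and evaluates the probability that the terminal value cT exceeds a level K.

```python
import math

def mean_A(c, t):
    """E A(t) = P(T <= t) for T exponential with rate c."""
    return 1.0 - math.exp(-c * t)

def mean_compensator(c, t, steps=100_000):
    """E[c * min(t, T)] = c * integral over [0, t] of P(T > s) ds (midpoint rule)."""
    h = t / steps
    return c * h * sum(math.exp(-c * (k + 0.5) * h) for k in range(steps))

def prob_compensator_exceeds(level):
    """P(c * T > level) = exp(-level): positive for every level."""
    return math.exp(-level)

c, t = 1.5, 2.0                      # hypothetical rate and time
print(mean_A(c, t), mean_compensator(c, t), prob_compensator_exceeds(10.0))
```

The two means agree to numerical precision, and the exceedance probability is positive for every level, so no constant bounds the assumed compensator.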

The  following  strengthens  the  definition  of  the  dual  previsible  projection: 

4.6.9.  Theorem: 

Let S, T be F-stopping times, with $S \le T$, and A an integrable, increasing
process. Then

$$E\{ \int_S^T {}^{p}X(t)\, dA(t) \mid F(S) \} = E\{ \int_S^T X(t)\, dA^p(t) \mid F(S) \} \qquad (11)$$

for any bounded (or positive), measurable process X.

4.6.10. Remark: The proof follows from the definition of conditional expectation
and the definition of dual previsible projection. Let $C \in F(S)$, and set $S_C$
and $T_C$ equal to the restrictions of S and T to the event C. Then we know that
the stochastic interval, $((S_C,T_C]]$, is previsible. Hence, from the properties
of previsible projections, we have

$$1_{((S_C,T_C]]}\, {}^{p}X = {}^{p}( 1_{((S_C,T_C]]}\, X ),$$

and from the definition of $A^p$, the dual previsible projection of A, we have

$$\mu_A( {}^{p}( 1_{((S_C,T_C]]}\, X ) ) = \mu_{A^p}( 1_{((S_C,T_C]]}\, X ).$$

Therefore,

$$\mu_A( 1_{((S_C,T_C]]}\, {}^{p}X ) = \mu_{A^p}( 1_{((S_C,T_C]]}\, X ).$$

This last equation is the same as

$$E\{ 1_C \int_S^T {}^{p}X\, dA \} = E\{ 1_C \int_S^T X\, dA^p \}.$$

Since $C \in F(S)$ is arbitrary, the last equation is equivalent to the statement
of the theorem.

4.6.11.  Definition:  Two  raw  increasing  processes,  A  and  B,  having  the  same 
dual  previsible  projection  are  said  to  be  associated.  If  A  and  B  are  associated, 
then  we  write  ApB. 

4.6.12.  Remark:  Dellacherie  shows  that  each  equivalence  class  determined  by  the 
relation  p  contains  one  and  only  one  previsible  increasing  process. 

4.6.13. Remark: We now set down some results whose main object is to
characterize adapted associated processes. This characterization will be extended
by “localization” in Chapter 6.

4.6.14.  Theorem: 

Let $A \in IV_0$. Then the following statements are equivalent:

(a) A is a martingale;

(b) $A^p$ is evanescent;

(c) $\mu_A$ vanishes on previsible random sets.

Remark: There is very little to prove. First consider the equivalence of (a) and
(c): Let s < t and $B \in F_s$. Then it is easy to see that
$\mu_A( ((s_B,t_B]] ) = E( 1_B (A_t - A_s) )$. Recalling the generators of G(PT),
it is clear that this equation entails the equivalence of (a) and (c). Let S and
T be stopping times, $S \le T$; then ((S,T]] is previsible. The equivalence of
(b) and (c) follows in a manner similar to the last case by noting that
$\mu_A( ((S,T]] ) = \mu_{A^p}( ((S,T]] )$. (We have left off some details in the
two pairs of equivalences concerning the generators $\{0\} \times B$ and
$[[0_B]]$, $B \in F_0$, respectively, but these are easy.)

The  following  Corollary  is  the  desired  characterization: 

4.6.15.  Corollary: 

Increasing, integrable processes A and B are associated iff the process M = A - B
is a martingale.

Although the implication (b) implies (a) gives the necessity of this Corollary,
it is instructive to use the previous Theorem 4.6.9. The necessity of the
Corollary follows from that Theorem by setting X equal to 1 and using the fact
that A and B have the same dual previsible projections. This yields

E(A(t)-A(s)  |  F(s)  )  =  E  (  B(t)  -  B(s)  |  F(s)  ) 

for  all  real  numbers  s  and  t,  with  s  <  t.  Since  A  and  B  are  adapted  (part  of  the 
definition  of  increasing  process),  it  follows  that,  A  -  B  is  a  martingale. 

Conversely, suppose that $A - B \in IV_0$ and A - B is a martingale. The last
Theorem tells us that $0 = (A - B)^p$, so that linearity gives $A^p = B^p$.
Therefore, ApB.

4.6.16. Definition: Let A be an integrable, increasing process. (Hence A is
adapted.) The dual previsible projection of A is called the (previsible)
compensator of A, and is denoted by A.

4.6.17.  Remark:  The  previsible  compensator  of  A  is  that  previsible  process  that 
must  be  subtracted  from  A  to  obtain  a  martingale. 
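The discrete time analogue of this remark can be checked by exact enumeration. In the sketch below, which is an illustration under assumed dynamics and not a construction from the text, A counts events, the conditional probability of an event at each step is a hypothetical function `rate` of the past only, and the compensator increment at each step is that conditional probability. The function computes the conditional expected increment of A, with or without the compensator subtracted, over several steps.

```python
from fractions import Fraction

def rate(history):
    """Hypothetical jump probability for the next step; depends on the past only."""
    return Fraction(1, 2) if sum(history) % 2 == 0 else Fraction(1, 4)

def expected_increment(history, steps, compensated):
    """E[Z(m + steps) - Z(m) | history], where m = len(history).

    Z = A - (compensator) if compensated, otherwise Z = A, and A counts events.
    """
    if steps == 0:
        return Fraction(0)
    p = rate(history)
    total = Fraction(0)
    for jump in (0, 1):
        weight = p if jump else 1 - p
        step = jump - (p if compensated else 0)
        rest = expected_increment(history + (jump,), steps - 1, compensated)
        total += weight * (step + rest)
    return total

histories = [(a, b) for a in (0, 1) for b in (0, 1)]
print([expected_increment(h, 3, True) for h in histories])
print([expected_increment(h, 3, False) for h in histories])
```

With the compensator subtracted, every conditional expected increment is exactly zero, which is the martingale property; without it the increments are strictly positive, as they must be for an increasing process.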

4.6.18. Remark: In the Chapter on martingale transforms, we noticed that in
discrete time, if an increasing previsible process was a martingale, then it was
a.s.P constant (and equal to zero if it took the value zero at the origin). A
similar remark can be made for the continuous time analogue. Again this follows
immediately from the last Theorem.

A direct proof repeats part of the proof of Theorem 4.6.14, perhaps more
carefully. The argument goes as follows: If A is an integrable, increasing
process which is also a martingale, then
$E( 1_D (A(t) - A(s)) \mid F(s) ) = 0$, for all D in F(s). But this says that the
measure generated by A vanishes on events of the form $(s,t] \times D$,
$s \le t$ and D in F(s). It is also obvious that this measure vanishes on
$\{0\} \times D$, where $D \in F(0)$. Since these events are generators of the
σ-algebra of F-previsible events, the measure generated by A vanishes on the
entire F-previsible σ-algebra. Therefore,
$\mu_{A^p}(X) = \mu_A({}^{p}X) = 0$, for all bounded, measurable X. It follows
that $A^p$ is evanescent. But if A is previsible, then $A = A^p$, and so A is
evanescent. Therefore,

4.6.19.  Theorem:  Integrable,  increasing,  previsible  martingales  are  evanescent. 

4.6.20.  Remark:  This  will  be  extended  to  local  martingales  in  Chapter  6. 

4.6.21. In the section on Lebesgue-Stieltjes stochastic integrals we have noted
that if $X \in L_1(A)$ is positive, then X.A is an increasing process. It is
natural to consider the dual previsible projection of X.A when either X or A is
previsible. Dellacherie [1972, V T31] shows that

(1) If A is an increasing previsible process, then $(X.A)^p = {}^pX.A$;

(2) If X is a positive, previsible, $L_1(A)$ process, then $(X.A)^p = X.A^p$.
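
The first proposition can be checked by a short duality computation. The sketch below is not taken from Dellacherie; it assumes the scalar product $\langle Y, A \rangle := E\int Y\,dA$, the duality $\langle Y, A^p \rangle = \langle {}^pY, A \rangle$, the rule ${}^p(ZY) = Z\,{}^pY$ for previsible Z, and the fact that for a previsible increasing A one has $A^p = A$ and $\langle Z, A \rangle = \langle {}^pZ, A \rangle$. For a positive measurable process Y:

```latex
\begin{aligned}
\langle Y, (X.A)^p \rangle
  &= \langle {}^pY, X.A \rangle
   = \langle {}^pY\,X, A \rangle
   = \langle {}^p({}^pY\,X), A \rangle  && \text{(A previsible)} \\
  &= \langle {}^pY\,{}^pX, A \rangle
   = \langle {}^p(Y\,{}^pX), A \rangle  && \text{(${}^pY$, ${}^pX$ previsible)} \\
  &= \langle Y\,{}^pX, A \rangle
   = \langle Y, {}^pX.A \rangle .       && \text{(A previsible)}
\end{aligned}
```

Since ${}^pX.A$ is itself previsible and increasing, uniqueness of the generating process identifies it with $(X.A)^p$.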

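4.6.22. Remark: We will prove the second proposition using the ordered scalar
product notation introduced above. Let Y be a positive, measurable process.
Since X is previsible, ${}^p(YX) = {}^pY\,X$, and so

$\langle Y, (X.A)^p \rangle = \langle {}^pY, X.A \rangle = \langle {}^pY\,X, A \rangle =$

$\langle {}^p(YX), A \rangle = \langle YX, A^p \rangle = \langle Y, X.A^p \rangle.$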

4.6.23. Remark: Dellacherie [1972] discusses the notion of absolute continuity
of increasing processes. This is of some importance in the analysis of counting
processes. Let A and B be (raw) increasing processes. A is said to be absolutely
continuous relative to B if Y.B = 0 implies Y.A = 0 for all positive measurable
processes, Y. If $\mu$ and $\lambda$ are the measures generated by B and A,
respectively, then this is the same as saying that $\lambda$ is absolutely
continuous relative to $\mu$. Thus, if f is the Radon-Nikodym density of
$\lambda$ relative to $\mu$, then, using the notation from the beginning of this
Chapter,

$\int_U X f\,d\mu = \int_U X\,d\lambda$

or, equivalently,

$E(Xf.B) = E(X.(f.B)) = E(X.A)$

for all bounded measurable processes X. The unicity (up to indistinguishability)
of the generating processes then implies that A = f.B, where by definition, f is
a positive measurable process in $L_1(B)$. When A and B are both previsible,
$A = A^p = (f.B)^p = {}^pf.B$. Therefore, if all the assumptions of this
paragraph hold and A and B are previsible, then there exists a previsible
process $g \in L_1(B)$ such that A = g.B. That is, the density f may be chosen
previsible.

4.6.24. Remark: In nonlinear filtering of point processes, N, an important class
of problems is covered by the case where A, the dual previsible projection of N,
is absolutely continuous relative to the deterministic process,
$B(t,\omega) = t$, a.s.P. In this case, f is called the intensity of the point
process N. More precisely, recalling that we have suppressed the underlying
filtration, $(F(t), t \ge 0)$, and recognizing that the dual previsible
projection depends strongly on its filtration, and the underlying probability,
P, f is called the F-intensity, or the (P,F)-intensity of N. The intensity is,
in general, a previsible stochastic process.

4.6.25. Remark: Now let’s cover the last paragraph again from a different
starting point. We return to Theorem 4.6.9, with X = 1, and take B to be the
dual previsible projection of A. If it is further assumed that B is absolutely
continuous relative to Lebesgue measure, with density $\lambda$, then as defined
earlier, $\lambda$ is the F-intensity of A and satisfies

$E( A(t) - A(s) \mid F(s) ) = E( \int_s^t \lambda(y)\,dy \mid F(s) ).$

This equation becomes extremely important when A is a counting process.
Bremaud [1981] is concerned almost exclusively with this case and the sections
below on nonlinear filtering will deal mostly with this case, following Bremaud
[1978,79,80,81]. For now we just consider the simple example when A is a one
jump counting process: $A(t) = 1_{[T \le t]}$, with T an F-stopping time. With
this definition of A the last equation becomes

$P( s < T \le t \mid F(s) ) = E( \int_s^t \lambda(y)\,dy \mid F(s) ).$

Using the “little o” notation, this statement has historically been written
somewhat less exactly as

$P( s < T \le t \mid F(s) ) = \lambda(s)(t - s) + o(t - s)$, $(t \to s+)$.

To justify this statement in a simple case, assume that $\lambda$ is right
continuous and use the identity

$\int_s^t \lambda(y)\,dy = \lambda(s)(t - s) + (t - s)\{ \frac{1}{t - s} \int_s^t ( \lambda(y) - \lambda(s) )\,dy \}.$
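
As a numerical illustration of the little-o statement, the following sketch takes a one jump process whose jump time T has the purely illustrative law $P(T > t) = e^{-t^2}$, observed through its internal history. Under that assumption the conditional probability on the event [T > s] is $1 - e^{-(t^2 - s^2)}$ and the intensity at s is the hazard rate 2s. The function names are hypothetical and not part of the text.

```python
import math

def hazard(s):
    """Hazard rate of T when P(T > t) = exp(-t**2) (illustrative choice)."""
    return 2.0 * s

def cond_jump_prob(s, t):
    """P(s < T <= t | T > s) for the same law, computed stably with expm1."""
    return -math.expm1(-(t * t - s * s))

def remainder_ratio(s, dt):
    """(P(s < T <= s + dt | T > s) - hazard(s) * dt) / dt; should tend to 0 with dt."""
    return (cond_jump_prob(s, s + dt) - hazard(s) * dt) / dt

if __name__ == "__main__":
    for dt in (1e-1, 1e-2, 1e-3, 1e-4):
        print(dt, remainder_ratio(1.0, dt))
```

The printed ratios shrink roughly in proportion to dt, which is what the o(t - s) term asserts for this example.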


4.7 Random Measures and Jacod’s Formula: In this section we will introduce a
useful formula for calculating the dual previsible projection of a point
process. The derivation of this formula in its most general form is due to Jacod
in his 1975 paper on multivariate (marked) point processes. (The origin of the
formula is contained in the paper of Dellacherie [1970], which considers a point
process with a single jump. Also see Brown [1978] for a short proof of the
formula in the case of simple point processes.) The results of Jacod's paper are
extremely important and go far beyond just the formula and, as pointed out by
Jacod, reading the paper does not require an enormous technical background.
Nevertheless, we will not attempt to give a digest of its contents. Our goal is
just to introduce Jacod’s “hazard” function formula for the compensator of a
marked point process, and to do this without proofs but with enough preliminary
explanation to allow one to understand why the formula holds. To accomplish this
we will first show how to develop a discrete parameter version of the Jacod
formula in the case of a simple (unmarked) point process. Then we will suggest
its continuous time analogue, recall (and extend) the concept of a random
measure from Chapter 3 and state the general Jacod formula together with some
useful special cases. Examples of the use of the formula are given in Chapter 5.

Recall the discussion and notation of Section 1.10 on discrete point processes,
$(N_n, F_n, n \in Z_+)$. So $N_n = \sum_{k=0}^{n} X_k$ with the $X_k$ being 0-1
Bernoulli random variables. As in Chapter 1, let $\lambda = (\lambda_k)$ be the
F-intensity of the point process $(X_k)$. Thus,

$\lambda_k = E(X_k \mid F_{k-1}) = P(X_k = 1 \mid F_{k-1}).$

Paralleling the assumptions of Jacod we take the filtration F to be the internal
history of the discrete point process X. Recall the definition of the “jump”
times $(T_n)$ of the counting process N. Then

$N_m = \sum_{n \ge 1} 1_{[T_n \le m]}.$

It is clear that this is a finite sum since the stopping times $(T_n)$ are
integer valued. But we will continue the practice of writing it as an infinite
sum.

We prove the following formula for the intensity of the discrete point process.
(Convention: If $\alpha$ and $\beta$ are two functions and $\beta(\omega) = 0$
implies that $\alpha(\omega) = 0$, then it is natural to define the quotient
$\alpha/\beta$ as zero whenever $\beta$ vanishes.)

4.7.1. Theorem:

Under the assumptions stated above

$\lambda_k = \sum_{n \ge 1} \frac{P(T_n = k \mid F_{T_{n-1}})}{P(T_n \ge k \mid F_{T_{n-1}})} 1_{[T_{n-1} < k \le T_n]},$   (12)

where $F_{T_j} = \sigma(T_1, \ldots, T_j)$.


First note that

$X_k = \Delta N_k = \sum_{n \ge 1} 1_{[T_n = k]},$

so that

$\lambda_k = E(X_k \mid F_{k-1}) = \sum_{n \ge 1} P(T_n = k \mid F_{k-1}).$

The following relation holds for the trace σ-algebra on $[T_{n-1} \le k-1 < T_n]$:

$F_{k-1} \cap [T_{n-1} \le k-1 < T_n] = F_{T_{n-1}} \cap [T_{n-1} \le k-1 < T_n].$

Observe that $[T_{n-1} \le k-1 < T_n] \in F_{k-1}$ since the T’s are F-stopping
times and
$[T_{n-1} \le k-1 < T_n] = [T_{n-1} \le k-1 < k \le T_n],$
since the stopping times are integer valued.

It follows from the last equality that

$[T_n = k] = [X_k = 1] \cap [T_{n-1} \le k-1 < k \le T_n].$


We need the following variation of Bayes Theorem:

4.7.2. Lemma (Brown [1978]):

Let $(\Omega, H, P)$ be a probability space. If G and K are sub σ-algebras of H,
$B \in H$, $C \in G$ and

$G \cap C = K \cap C,$

then

$P(B \cap C \mid G) = \frac{P(B \cap C \mid K)}{P(C \mid K)}$

on C and = 0 on the complement of C, where $P(C \mid K)$ is a version with
$P(C \mid K) > 0$ on C.

Remark: Brown does not give a proof of this Lemma, but it follows easily from
the “quotient rule” for Radon-Nikodym derivatives by paying careful attention
to the use of the restrictions of P to the various sub σ-algebras involved in
the hypotheses.


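Using this Lemma, which applies due to (13) through (14) (take
$G = F_{k-1}$, $K = F_{T_{n-1}}$, $B = [X_k = 1]$ and
$C = [T_{n-1} \le k-1 < T_n]$), we have

$P(T_n = k \mid F_{k-1}) = \frac{P(T_n = k \mid F_{T_{n-1}})}{P(T_n \ge k \mid F_{T_{n-1}})} 1_{[T_{n-1} < k \le T_n]},$

and consequently formula (12).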
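
Formula (12) can be checked exactly in a small case. The sketch below assumes a discrete renewal process, that is, i.i.d. inter-arrival times with an illustrative law on {1, 2, 3}, observed through its internal history; in that case the right side of (12) reduces to the discrete hazard of the current inter-arrival time. The law `PMF` and all function names are hypothetical choices made for the illustration.

```python
from fractions import Fraction
from itertools import product

# Illustrative inter-arrival law on {1, 2, 3}; any law on the positive integers would do.
PMF = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def path_law(m):
    """Exact law of (X_1, ..., X_m) for a renewal process with i.i.d. gaps ~ PMF."""
    law = {}
    # m gaps always reach time m or beyond, because every gap is at least 1.
    for gaps in product(PMF, repeat=m):
        prob = Fraction(1)
        jump_times, t = set(), 0
        for g in gaps:
            prob *= PMF[g]
            t += g
            jump_times.add(t)
        path = tuple(int(k in jump_times) for k in range(1, m + 1))
        law[path] = law.get(path, Fraction(0)) + prob
    return law

def hazard(d):
    """P(S = d) / P(S >= d), with the convention that the quotient is 0 when the tail is 0."""
    tail = sum(p for s, p in PMF.items() if s >= d)
    return PMF.get(d, Fraction(0)) / tail if tail else Fraction(0)

def check_formula(m):
    """Largest gap between P(X_k = 1 | X_1..X_{k-1}) and the hazard form of (12)."""
    law = path_law(m)
    worst = Fraction(0)
    for k in range(1, m + 1):
        for prefix in product((0, 1), repeat=k - 1):
            p_prefix = sum(p for x, p in law.items() if x[:k - 1] == prefix)
            if p_prefix == 0:
                continue  # history of probability zero: nothing to compare
            p_jump = sum(p for x, p in law.items()
                         if x[:k - 1] == prefix and x[k - 1] == 1)
            # Time of the last jump before k, i.e. T_{n-1} on [T_{n-1} < k <= T_n].
            last = max((j + 1 for j, x in enumerate(prefix) if x), default=0)
            worst = max(worst, abs(p_jump / p_prefix - hazard(k - last)))
    return worst
```

Because the arithmetic is done with `Fraction`, `check_formula(m)` compares the two sides of (12) without rounding error on every history of positive probability.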


Having obtained formula (12) for the “first difference” of the compensator of a
discrete point process, it is natural to conjecture that the compensator A of a
point process $N = (N_t, F_t, t \ge 0)$, where $F_t = \sigma(N_s, s \le t)$ and
$N(t) = \sum_{n \ge 1} 1_{[T_n \le t]}$, satisfies

$A(dt) = \sum_{n \ge 1} \frac{P(T_n \in dt \mid F_{T_{n-1}})}{P(T_n \ge t \mid F_{T_{n-1}})} 1_{[T_{n-1} < t \le T_n]}.$   (15)

This equation does indeed hold and occurs in various forms (e.g., Brown [1978],
Liptser-Shiryayev [1978]) and in numerous applications (e.g., Jacobson [1982],
Gill [1980]). We will come back to this case at the end of the present Section
where it will occur as a consequence of the general Jacod formula. For this
purpose we need to recall the concept of marked point processes and random
measures which were mentioned briefly in Chapter 3.

4.7.3. Remark: Let $(T_n, n \in Z_+)$ be a point process relative to the
probability space $(\Omega, H, P)$: $(T_n)$ is an increasing sequence of random
variables on $\Omega$ with values in $\bar{R}_+$ and such that
$T_n < T_{n+1}$ on $[T_n < \infty]$. Set $N_t = \sum_{n \ge 1} 1_{[T_n \le t]}$.
Let $(E, \mathcal{E})$ be a measurable space and $(Z_n, n \in Z_+)$ a sequence
of random variables on $\Omega$ with values in the space E. Hence, the $Z_n$ are
H-measurable relative to $\mathcal{E}$. $(Z_n)$ is called the sequence of marks
and E the mark space. E is assumed to have the structure of a Borel subset of a
complete metric space. (This assumption is sufficient for the existence of
regular conditional distributions of random variables with values in E
(Shiryayev [1984]). The usual applications will have $E = R^n$, for some natural
number n, or $E = R_+$.)

The definition of the range of the sequence of marks is extended in the
following way: Let $\delta$ be some point exterior to E and define
$Z_n(\omega) = \delta$ iff $T_n(\omega) = \infty$ (the nth event “never
occurs”). To understand why this extension is made see Jacod [1979, p.74].
Finally, let $Z_0(\omega) = \delta$ for all $\omega$ in $\Omega$, $T_0 = 0$ and
$S_{n+1} = T_{n+1} - T_n$, for $n \ge 0$. The sequence $(T_n, Z_n)$ is called a
marked point process.

Let

$N_t^A = \sum_{n \ge 1} 1_{[T_n \le t]} 1_{[Z_n \in A]}.$

Then $N^A$ counts the number of jumps of N that have marks in A.

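Set $E_\delta = E \cup \{\delta\}$, $\tilde{E} = (0,\infty) \times E$,
$\tilde{E}_\delta = \tilde{E} \cup \{(\infty,\delta)\}$ and
$\tilde{\Omega} = \Omega \times [0,\infty) \times E$. With an analogous meaning,
let $\mathcal{E}_\delta$, $\tilde{\mathcal{E}}$, $\tilde{\mathcal{E}}_\delta$ be
the usual σ-algebras on $E_\delta$, $\tilde{E}$, $\tilde{E}_\delta$,
respectively. (E.g., $\tilde{\mathcal{E}} = B((0,\infty)) \otimes \mathcal{E}$.)

Let $(\Omega, H, F, P)$ be a filtered probability space with the filtration
$F = (F_t, t \ge 0)$. Then if $\Pi = \Pi(F)$ denotes the σ-algebra of
F-previsible subsets of $\Omega \times [0,\infty)$, set
$\tilde{\Pi} := \tilde{\Pi}(F) = \Pi \otimes \mathcal{E}$.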

With this structure, we call the family
$\mu = \{\mu(\omega, \cdot) : \omega \in \Omega\}$ of nonnegative functions a
random measure on $(\tilde{E}, \tilde{\mathcal{E}})$ if $\mu$ is a positive
transition measure (Appendix A). That is, if

(a) $\omega \to \mu(\omega, A)$ is H-measurable for each $A \in \tilde{\mathcal{E}}$, and

(b) $A \to \mu(\omega, A)$ is a positive σ-finite measure for each $\omega \in \Omega$.

Further, a random measure $\mu$ is said to be an integer valued measure if

(c) $\mu(\omega, A) \in Z_+$, for each $A \in \tilde{\mathcal{E}}$, and

(d) $\mu(\omega, \{t\} \times E) \le 1$, for all $\omega \in \Omega$ and $t > 0$.

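Let $\mu$ be a random measure on $(\tilde{E}, \tilde{\mathcal{E}})$. If
$W = (W(\omega,t,z), \omega \in \Omega, t \ge 0, z \in E)$ is a nonnegative
$H \otimes B((0,\infty)) \otimes \mathcal{E}$ measurable function on
$\tilde{\Omega}$, set

$W*\mu_t(\omega) := \int_{(0,t] \times E} W(\omega,s,z)\,\mu(\omega,ds,dz).$

By denoting by $W \circ \mu$ the measure whose Radon-Nikodym derivative relative
to the measure $\mu(\omega, \cdot)$ is $W(\omega, \cdot)$, we can write the last
definition in the form

$W*\mu_t(\omega) := (W \circ \mu)(\omega, (0,t] \times E).$

A random measure $\eta$ is said to be F-previsible if for each positive
$\tilde{\Pi}$-measurable process X, the process $X*\eta_t$ is F-previsible.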

The marked point process $(T_n, Z_n, n \in Z_+)$ is completely determined by the
random measure $\mu$ defined on $(\tilde{E}, \tilde{\mathcal{E}})$ by setting

$\mu(\omega, B) = \sum_{n \ge 1} 1_B(T_n(\omega), Z_n(\omega)) 1_{[T_n < \infty]}$   (16)

for all $B \in \tilde{\mathcal{E}}$. We will often refer to such a measure as a
point process measure or the random measure of a point process.

It will be convenient to also write $\mu$ in the form

$\mu(\omega, dt, dz) = \sum_{n \ge 1} \epsilon_{(T_n(\omega), Z_n(\omega))}(dt,dz) 1_{[T_n < \infty]},$   (17)

where $\epsilon_a$ is the Dirac measure (unit mass concentrated) at the point a.

Following Jacod [1975], to each probability measure P on $(\Omega, H)$ and point
process random measure $\mu$ we associate a nonnegative measure $M_\mu$ on
$(\tilde{\Omega}, \tilde{\Pi})$ defined by setting

$M_\mu(W) := E((W*\mu)_\infty)$

for any nonnegative $\tilde{\Pi}$-measurable function W on $\tilde{\Omega}$.
Jacod then proves

4.7.4. Lemma:

If $\eta$ is a random measure such that $M_\eta$ is σ-finite, then there exists
a unique (up to a P-null set) F-previsible random measure $\tilde{\eta}$ such
that for each positive $X \in \tilde{\Pi}$,

$M_\eta(X) = M_{\tilde{\eta}}(X).$

Remark: Comparing this result to Theorem 4.6.7, it is clearly appropriate to
call the random measure $\tilde{\eta}$ of this Lemma the dual previsible
projection of the measure $\eta$. In a moment, we will point out the more
compelling reason that $\tilde{\eta} - \eta$ is, in a natural sense, a
martingale.

In order to apply this Lemma to the random measure $\mu$ of a marked point
process (refer to equation (16)), Jacod shows that $M_\mu$ is σ-finite on
$(\tilde{\Omega}, \tilde{\Pi})$. Then he obtains


4.7.5. Theorem:

If $(T_n, Z_n)$ is a marked point process and $\mu$ is given by (17), then there
exists a unique (up to modification on P-null events) F-previsible random
measure $\nu$ such that for each positive previsible process X
($X \in \tilde{\Pi}$),

$E \int_{(0,\infty) \times E} X(t,z)\,\mu(dt,dz) = E \int_{(0,\infty) \times E} X(t,z)\,\nu(dt,dz).$   (18)

Jacod then uses (18) and one of the Section Theorems to show that the dual
previsible projection $\nu$ of $\mu$ in (18) can be chosen so that

$\nu(\{t\} \times E) \le 1$   (19)

and

$\nu([T_\infty, \infty) \times E) = 0.$   (20)

Remark: Set $A_t^C(\omega) := \nu(\omega, (0,t] \times C)$ and $A_t := A_t^E$.
Then A is the compensator of $N_t^E = N_t = \sum_{n \ge 1} 1_{[T_n \le t]}$ and
inequality (19) says that the jumps of A have magnitude not exceeding one:
$0 \le \Delta A_t \le 1$. Equation (20) says that A does not charge the random
set $[T_\infty, \infty)$.

In order to emphasize the connection between this and earlier results of the
Chapter (when E is a singleton set), we note that the dual previsible projection
of the marked point process measure $\mu$ is characterized by (19) and (20)
together with the requirements that

(i) the process $(\nu((0,t] \times B), t \ge 0)$ is previsible for each
$B \in \mathcal{E}$, and

(ii) $m_t^{n,B} := \mu((0,t \wedge T_n] \times B) - \nu((0,t \wedge T_n] \times B)$
defines a uniformly integrable martingale $m^{n,B} = (m_t^{n,B}, t \ge 0)$, for
each $n > 0$ and $B \in \mathcal{E}$.


4.7.6. Remark: Set $G_t = \sigma(N_s^A, s \le t, A \in \mathcal{E})$. It can be
shown (e.g., Itim [1980], Bremaud [1981]) that the filtration G is continuous on
the right and $G_S = \sigma(N^A_{t \wedge S}, t \ge 0, A \in \mathcal{E})$,
where S is a G-stopping time, and from this that for $n \ge 1$,

$G_{T_n} = \sigma(T_k, Z_k, 1 \le k \le n)$   (21)

and

$G_{T_n-} = \sigma(T_k, Z_k, T_n, 1 \le k \le n-1).$

Now if we take into account the probability measure P, and define $F_t$,
$t \ge 0$, to be the smallest σ-algebra generated by the union of the family,
$\Gamma$, of all P-null sets in H and $G_t$, then the family F retains the right
continuity of G. Therefore, since

$F_t = \sigma(\Gamma \cup G_t)$   (22)

is complete, it is a filtration satisfying the “usual conditions”.

For each $n \in Z_+$, let $K_n(\omega, dt, dz)$ be a version of the regular
conditional distribution of $(S_n, Z_n)$ given $F_{T_{n-1}}$ and
$H_n(\omega, dt) = K_n(\omega, dt, E_\delta)$, the conditional distribution of
$S_n$. $K_n(\omega, \cdot)$ is a probability on $\tilde{E}_\delta$ while
$H_n(\omega, \cdot)$ is a probability on $(0,\infty]$.

We  can  now  state  the  Jacod  formula: 


4.7.7. Theorem:

With the filtration $F = (F_t, t \ge 0)$ given by equation (22), the dual
previsible projection $\nu$ of the random measure $\mu$ of (17) satisfies

$\nu(dt,dz) = \sum_{n \ge 1} \frac{K_n(dt - T_{n-1}, dz)}{H_n([t - T_{n-1}, \infty])} 1_{[T_{n-1} < t \le T_n]}.$   (23)


4.7.8. Remark: Several examples of marked point processes were given in
Chapter 3. But for the purposes of this section an informative example to keep
in mind is that of a jump process. A jump process, $X = (X_t, t \ge 0)$, is a
Skorokhod process all of whose paths are step functions (with only a finite
number of jumps in any bounded interval of time; see Appendix A). If we let the
sequence $(T_n, n \ge 1)$ denote the sequence of jump times of such a process X
and $(Z_n, n \ge 1)$ the sequence of jump sizes of X at these jump times,
$Z_n := \Delta X_{T_n}$, then with the proper conventions at time 0,
$(T_n, Z_n)$ is a marked point process and

$X_t(\omega) = X_0(\omega) + \sum_{n \ge 1} Z_n(\omega) 1_{[[T_n, \infty))}(\omega, t).$   (24)

In this case the random measure $\mu$ in (17) is called the saltus measure or
jump measure of the process X.


In this case, with the filtration as in (22), the Theorem of Jacod shows us that
the dual previsible projection $\nu$ of the saltus measure of X can be written
in the form

$\nu(dt,dz) = \sum_{n \ge 1} \frac{P(T_n \in dt, Z_n \in dz \mid F_{T_{n-1}})}{P(T_n \ge t \mid F_{T_{n-1}})} 1_{((T_{n-1}, T_n]]}(t).$   (25)

We have expressed the conditional laws of (23) in terms of the process $(T_n)$
instead of the inter-occurrence times $(S_n)$ so that we could make a direct
comparison with the discrete point process case given in (12). As one can see,
(25) could have been conjectured from (12).


4.7.9. Remark: From the standpoint of the creators of the General Theory of
Stochastic Processes and existing literature, one would deduce (12) (similar
remarks would hold for its marked analogue) from (23). To see how this can be
accomplished, we will use the notation for discrete point processes given at the
beginning of this Section and let $\lfloor \cdot \rfloor$ denote the “greatest
integer” function. Since it does not make the problem more difficult, we will
assume in our discrete parameter case that there is a sequence of marks,
$(Z_n)$. Thus, we start with the integer valued times $(T_n)$, the sequence of
marks $(Z_n)$ and the filtration $F = (F_n, n \ge 0)$. The filtration will be
the one defined by

$F_{T_n} = \sigma(T_k, Z_k, 1 \le k \le n, \Gamma)$   (26)

and

$F_{T_n-} = \sigma(T_k, Z_k, T_n, 1 \le k \le n-1, \Gamma),$

with $\Gamma$ the family of all P-null sets in H.

Once we define the continuous time filtration, this is enough information to
construct the random measure $\mu$ and its dual previsible projection $\nu$, as
well as the continuous parameter point processes, $N^B = (N_t^B, t \ge 0)$ and
$N := N^E$. The continuous parameter filtration F can be defined by setting
$F_t := F_{\lfloor t \rfloor}$. Since the times are integers, this gives in
particular that the σ-algebras $F_{T_n}$ of the two filtrations coincide.

Thus all the main features of an induced continuous parameter marked point
process have been defined. For example, the dual previsible projection of N is
$A_t := \nu((0,t] \times E)$. To recover (12) from (23), just define the
F-intensity by $\lambda_{\lfloor t \rfloor} := \nu(\{\lfloor t \rfloor\} \times E)$
and the result follows. By Jacod’s Theorem then, $0 \le \lambda_k \le 1$.




Certainly  this  is  the  proper  route  to  (12).  But  from  the  standpoint  of  building  an 
intuition  and  gaining  the  interest  of  practitioners  from  other  fields  the  discrete 
parameter  approach  has  some  worth. 

4.7.10. Remark: We will close this Section and the Chapter by stating oft used
forms of Jacod's formula for unmarked point processes. Assume that the
filtration is given by (22) with $E = \{1\}$, so that G (in (22)) takes the
obvious form. If the point process $N_t = \sum_{n \ge 1} 1_{[T_n \le t]}$ has
dual previsible projection A (with $A_0 = 0$), then using the notation of
Jacod’s Theorem (23) we have $A_t = \nu((0,t])$, so that equation (15) holds.

Also, in the unmarked case, $H_n = K_n$, so that (assuming the point process is
non-explosive) we can write the compensator in terms of the conditional
inter-occurrence time distributions by integrating equation (23) over
$(T_{n-1}(\omega), t]$, for $T_{n-1}(\omega) < t \le T_n(\omega)$, and then
making a change of variable to obtain

$A_t(\omega) = A_{T_{n-1}}(\omega) + \int_0^{t - T_{n-1}(\omega)} \frac{dK_n(\omega,s)}{1 - K_n(\omega,s-)},$   (27)

when $T_{n-1}(\omega) < t \le T_n(\omega)$, $n \ge 1$.

Here is a particular example of (27): Let $K_n(y) = 1 - e^{-\lambda_n y}$, for
$y \ge 0$, zero otherwise; let $E = \{1\}$. Then, from (27),

$A_t = A_{T_{n-1}} + \lambda_n (t - T_{n-1}),$

when $(t,\omega) \in ((T_{n-1}, T_n]]$. This yields the interesting relationship

$A_{T_n} - A_{T_{n-1}} = \lambda_n (T_n - T_{n-1}),$

for Markovian systems.
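
The computation behind this example is short. As a sketch, write $u = t - T_{n-1}$ (a symbol introduced here only for the illustration) and note that the exponential distribution function is continuous, so $K_n(s-) = K_n(s)$:

```latex
\int_0^{u} \frac{dK_n(s)}{1 - K_n(s-)}
  = \int_0^{u} \frac{\lambda_n e^{-\lambda_n s}}{e^{-\lambda_n s}}\,ds
  = \lambda_n u
  = \lambda_n (t - T_{n-1}).
```

The integrand is the constant hazard rate $\lambda_n$ of the exponential law, which is why the compensator grows linearly between jumps.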

Formula (27) can be rewritten in terms of the conditional distribution functions
of the $(T_n)$. (So can (23), of course.) For this purpose, let
$L_n(\omega,s) = K_n(\omega, s - T_{n-1}(\omega))$; then $L_n$ is the
conditional distribution function of $T_n$, given $F_{T_{n-1}}$, and

$A_t = A_{T_{n-1}} + \int_{T_{n-1}}^{t \wedge T_n} \frac{dL_n(s)}{1 - L_n(s-)},$   (28)

on $[T_{n-1} < t \le T_n]$, $n \ge 1$. Simple direct proofs of equation (28)
without the aid of (23) can be found in T. C. Brown [1978] and Liptser,
Shiryayev [1978].


Finally, we point out that if the point process N has an F-intensity
$\lambda = (\lambda_s, s \ge 0)$ (i.e., if A is absolutely continuous with
respect to Lebesgue measure, with $\lambda$ as the Radon-Nikodym derivative),
then

$\lambda_t(\omega) = \sum_{n \ge 0} \frac{k^{(n+1)}(\omega, t - T_n(\omega))}{K^{(n+1)}(\omega, [t - T_n(\omega), \infty))} 1_{((T_n, T_{n+1}]]}(\omega, t),$   (29)

where $k^{(n+1)}$ is the conditional density of $S_{n+1}$ given $F_{T_n}$.
Hence, we have an interpretation of the intensity as a conditional hazard
function.
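
The hazard interpretation can be tried out by simulation. The sketch below assumes a renewal process whose inter-occurrence times are i.i.d. with the illustrative law $P(S > u) = e^{-u^2}$, so that the hazard rate is 2u and, by (27), the compensator accumulates $u^2$ over an inter-occurrence interval of length u. It then estimates $E(N_t - A_t)$, which should be close to zero if A is the compensator. The distribution, sample sizes and function names are assumptions made for the illustration.

```python
import math
import random

def cumulative_hazard(u):
    """Integral over (0, u] of the hazard rate 2s, for the law P(S > u) = exp(-u**2)."""
    return u * u

def sample_gap(rng):
    """Draw S with P(S > u) = exp(-u**2): the square root of a unit exponential."""
    return math.sqrt(rng.expovariate(1.0))

def count_and_compensator(t, rng):
    """Return (N_t, A_t) for one simulated renewal path on [0, t]."""
    last_jump, n_jumps, comp = 0.0, 0, 0.0
    while True:
        gap = sample_gap(rng)
        if last_jump + gap > t:
            # No further jump before t: accumulate the hazard only up to time t.
            comp += cumulative_hazard(t - last_jump)
            return n_jumps, comp
        comp += cumulative_hazard(gap)
        last_jump += gap
        n_jumps += 1

def mean_difference(t=2.0, n_paths=20000, seed=0):
    """Monte Carlo estimate of E(N_t - A_t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        n, a = count_and_compensator(t, rng)
        total += n - a
    return total / n_paths
```

Calling `mean_difference()` returns a value whose distance from zero is of the order of the Monte Carlo error, in line with N - A being a martingale null at zero.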


Chapter  5.  Local  Martingales  and  Semi-Martingales 

5.1. Local Martingales: The important concept of local martingales was
introduced by K. Ito and S. Watanabe in an article titled Transformation of
Markov Processes by Multiplicative Functionals, published in the Annales de
l'Institut Fourier in 1965. This concept provides a generalization of
martingales which will be used to extend the stochastic integral developed in
Chapter 6 beyond the class of square integrable martingales.

5.1.1. Definition: An adapted, Skorokhod process M is said to be an F-local
martingale iff there exists a sequence of F-stopping times, $(T(n), n \ge 1)$,
increasing to $\infty$ as $n \to \infty$, such that for each n,
$m_n = (M_{t \wedge T(n)}, t \ge 0)$ is a uniformly integrable F-martingale.

We also introduce the term F-local $L^p$-martingale as a process, M, for which
there exists a sequence of F-stopping times, $S_n \uparrow \infty$, such that
for each n, $m_n(t) = M(S_n \wedge t)$, $t \ge 0$, defines an $L^p$-martingale.

5.1.2. Remark: The notation is attempting to say that for each n, the process
defined by $t \to m_n(t) = M(T(n) \wedge t)$ is a uniformly integrable
martingale.

The sequence (T(n)) is called the localizing sequence of the local martingale,
or of the local $L^p$-martingale. This device of only requiring desirable
properties such as boundedness and integrability locally (on stochastic
intervals $[[0,T(n)))$) occurs frequently in the theory of martingales and will
be discussed at some length in Chapter 6. Relative to paths, this particular
form of localization is in the same spirit as truncation of functions in
classical analysis and probability theory, with the further qualification that
it is intended for use on processes that will occur as integrators in
(stochastic) integrals. Another type of localization by stopping times for
integrands will occur in Chapter 6. In the study of martingales, localization is
a type of path-wise truncation that is mathematically tractable because of the
Doob Optional Sampling (Stopping) Theorem.

5.1.3. Remark: The definition states that $(m_n(t), F(t), t \ge 0)$ is a
uniformly integrable martingale for each $n > 0$. This can be proved equivalent
to the same requirement on $(m_n(t), F(T(n) \wedge t), t \ge 0)$ (Kallianpur
[1980]).

5.1.4. Remark: Actually, the definition of a local martingale does not have to
include uniform integrability. This can always be achieved by replacing T(n)
with $T(n) \wedge n$. This remark is made to highlight what is really being
assumed. In this spirit, we remark that if X is a bounded local F-martingale,
then X is an F-martingale.

Just notice that $X(y \wedge T(n)) \to X(y)$, a.s.P, $y \ge 0$, as
$n \to \infty$, and since
$E( X(t \wedge T(n)) \mid F(s) ) = X(s \wedge T(n))$, for $s \le t$, the
boundedness allows us to pass to the limit under the expectation as
$n \to \infty$ to obtain $E( X(t) \mid F(s) ) = X(s)$, a.s.P.

Similarly, using Fatou’s Lemma (the liminf part), it is easy to show that a
positive local martingale is a positive supermartingale.
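
The Fatou step can be written out as follows. This is a sketch that assumes X is a positive local martingale with localizing sequence (T(n)), that X(0) is integrable, and that $s \le t$:

```latex
E\bigl( X(t) \bigm| F(s) \bigr)
  = E\bigl( \liminf_{n \to \infty} X(t \wedge T(n)) \bigm| F(s) \bigr)
  \le \liminf_{n \to \infty} E\bigl( X(t \wedge T(n)) \bigm| F(s) \bigr)
  = \liminf_{n \to \infty} X(s \wedge T(n))
  = X(s), \quad \text{a.s.P.}
```

Taking s = 0 and then expectations gives $E X(t) \le E X(0)$, so each X(t) is integrable and the supermartingale inequality above makes sense.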

5.1.5. Remark: Every martingale is a local martingale. To see this, take
T(n) = n and let M be a martingale. Then

$M(T(n) \wedge t) = M(n \wedge t) = E\{ M(n) \mid F(t \wedge n) \} = E\{ M(T(n)) \mid F(t \wedge T(n)) \}.$

It follows, from the characterization of uniform integrability given in
Chapter 2, with $M(T(n)) = Z(\infty) = Z(\infty,n)$, that
$t \to M(T(n) \wedge t)$ is a uniformly integrable martingale.


5.1.6. Finally, Chung and Williams give the following converse of sorts to the
previous observations.

5.1.7. Theorem.

If M is a local $L^p$-martingale and if for each $t \ge 0$,
$\{ |M_{t \wedge T(k)}|^p, k \ge 1 \}$ is uniformly integrable, where (T(k)) is
the localizing sequence of M, then M is an $L^p$-martingale.

5.1.8. Remark: We observe the following fact, which will explain to some extent
the Ito-Kunita-Watanabe approach to stochastic integration, which builds on the
class of square integrable martingales. Consider an almost surely continuous
martingale, m. Define the sequence (T(n)) by
$T(n) := \inf\{ t : |m(t)| > n \}$, and $= \infty$ if $\{...\}$ is empty, for
each positive integer n. Each T(n) is a stopping time by results in Chapter 2
concerning debuts. By Doob’s Optional Sampling Theorem, $m(t \wedge T(n))$ is a
martingale, for each n, and $|m(t \wedge T(n))| \le n$, for all $t \ge 0$. It
follows that the stopped continuous martingale is bounded on the interval
$[[0,T(n)]]$, and so “square integrable”, in a sense to be made precise in
Chapter 6. Therefore, once the stochastic integral has been defined for square
integrable martingales, it is available for all continuous martingales by
localization.

It should be noted that if the trajectories of m are not continuous then
$m(t \wedge T(n))$ is bounded only on $[[0, T(n)))$. We have no idea of the
magnitude of any possible jump at T(n). Providing for this, in extensions of the
integral, is one of the difficult issues in the construction of a stochastic
integration theory for arbitrary local martingales, rather than just for
continuous local martingales.
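
The contrast between the two cases can be seen in a discrete-time toy model. The sketch below is only an analogy and is not taken from the text: a symmetric walk with steps of size one plays the role of a continuous martingale (it cannot skip over a level), while a symmetric walk that also takes steps of size three plays the role of a martingale with jumps. Both are frozen at the first time their absolute value reaches a given level, a discrete stand-in for T(n). All names and parameter values are hypothetical.

```python
import random

def stopped_path(steps, level, n_steps, rng):
    """Run a mean-zero random walk with the given step values and freeze it at the
    first time its absolute value reaches `level`."""
    m, path, stopped = 0, [0], False
    for _ in range(n_steps):
        if not stopped:
            m += rng.choice(steps)
            stopped = abs(m) >= level
        path.append(m)
    return path

def max_abs_over_paths(steps, level, n_paths=2000, n_steps=500, seed=1):
    """Largest absolute value attained by any of the simulated stopped paths."""
    rng = random.Random(seed)
    return max(
        max(abs(x) for x in stopped_path(steps, level, n_steps, rng))
        for _ in range(n_paths)
    )

if __name__ == "__main__":
    print("unit steps:", max_abs_over_paths((-1, 1), 5))
    print("steps of size 1 and 3:", max_abs_over_paths((-3, -1, 1, 3), 5))
```

With unit steps the stopped walk never exceeds the level, whereas with steps of size three it can land above the level at the stopping time, which is the overshoot the preceding paragraph warns about.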

5.1.9. Having hinted at one use of localization we will now formally state and
prove a result (Chung-Williams [1983, p. 21]) that is required for the work in
Chapter 6:

5.1.10. Lemma:

Any continuous local martingale is a local $L^p$-martingale for any
$p \in [1,\infty]$.

The proof of this technical result depends strongly on the previously stated
“converse” of Chung and Williams, so the idea of the proof is to figure out how
to stop the local martingale in such a way that it defines a sequence of
uniformly integrable local martingales. Let m be the continuous local martingale
with localizing sequence $(T_n)$ and set
$S_k = \inf\{ t \ge 0 : |m(t)| > k \}$, a sequence of stopping times
(Chapter 2).

Then $R_{kn} := \min(S_k, T_n)$ defines a double sequence of stopping times. So
Doob’s Optional Stopping Theorem tells us that $(m(t \wedge R_{kn}))$ is a
double sequence of martingales. Further, by definition of $(S_k)$, for each k,
this sequence of martingales is bounded by k for all n. Therefore, by the
converse just stated, $m_k(t) := m(t \wedge S_k)$ defines a martingale for each
fixed k. Since the paths of m are continuous, $S_k$ increases to $\infty$ with
k. Hence, $(S_k)$ is a localizing sequence for m such that for each k, $m_k$ is
bounded and so in $L^p$ for any $p \ge 1$.

5.1.11. Remark: There is also a close relationship between martingale transforms
and local martingales. Let X be adapted. It can be shown that X is a (discrete
time) local martingale iff X is the transform of a martingale (Meyer [1973]). It
follows that if X is a P-integrable, local martingale then it is a martingale.
(This is not true in continuous time.) So a local martingale, in discrete time,
is not much of a generalization of a martingale.

5.2. Semi-Martingales: We have encountered the concept of semi-martingale
several times in this note. We can now give a general definition of this
concept.

5.2.1. Definition: A Skorokhod process, $X = (X(t), t \ge 0)$, is called a
semi-martingale if it allows the following decomposition:

X(t) = X(0) + m(t) + A(t),

where m is an F-local martingale, null at time zero, and A is a process of
bounded variation (BV(F)).

5.2.2. Remark: In the last section of this chapter we will give a number of
examples to illustrate how a wide variety of particular processes can easily be
put into the form of a semi-martingale.

Recall the Doob-Meyer Decomposition in Chapter 1. There are numerous varieties
of this decomposition theorem. This particular form will be deduced from a much
more restrictive and easily proved form in Chapter 6. Indeed, in our attempt to
construct a stochastic integral relative to semi-martingales, we will spend a
relatively large amount of effort studying semi-martingales in Chapter 6.


5.2.3. Theorem: (Doob-Meyer Decomposition)

If X is a submartingale, then there exists a unique previsible increasing process A,
A(0)=0, and a local martingale M, M(0)=0, such that

$$X(t) = X(0) + M(t) + A(t).$$

This decomposition is unique (a.s.P).

5.2.4. Remark: A is the previsible compensator of X, as defined in the Chapter
on dual previsible projections. We will illustrate the Doob-Meyer Decomposition
with counting processes.

We first note that since a counting process, N, always has nondecreasing sample
paths, it is a submartingale. It follows from the Decomposition theorem that
there exists an increasing, F-previsible, P-integrable process, A, with A(0)=0,
and an F-local martingale, m, with m(0) = 0, such that N = m + A.

5.2.5.  Theorem: 

Let N be a point process adapted to the filtration F = (F(t), t ≥ 0). Then there
exists a unique, F-previsible, increasing process, A, with A(0) = 0, such that
N(t) = M(t) + A(t), where M is an F-local martingale, M(0) = 0. The localization
sequence, (T_n, n ≥ 1), for M may be defined by setting
T_n := inf{ t | N(t) ≥ n } if {...} ≠ ∅, and T_n := ∞ otherwise.

It can be shown [e.g. Liptser and Shiryayev vol II] that A is continuous iff the
counting process, N, only jumps at totally inaccessible times. Since applications
in this note will concentrate primarily on counting processes with absolutely
continuous compensators, it follows that in these cases the counting process jump
times are always totally inaccessible. As noted earlier, the Poisson process is such
a process. Its compensator is, of course, given by A(t) = λt, where λ > 0.
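
The localization sequence of Theorem 5.2.5 can be illustrated path by path. The
following Python sketch is not from the source: it assumes a single counting path
described by a sorted list of jump times, and the names hitting_time,
counting_path and stopped_path are hypothetical. It only shows that the stopped
path N(t∧T_n) never exceeds n, which is what makes the stopped process bounded.

```python
import math


def hitting_time(jump_times, n):
    """T_n = inf{t : N(t) >= n} for a counting path with sorted jump times.

    Returns math.inf when the path never reaches level n (n >= 1).
    """
    return jump_times[n - 1] if n <= len(jump_times) else math.inf


def counting_path(jump_times, t):
    """N(t): number of jumps that occurred at or before time t."""
    return sum(1 for s in jump_times if s <= t)


def stopped_path(jump_times, n, t):
    """Stopped path N(min(t, T_n)); by construction it is bounded by n."""
    return counting_path(jump_times, min(t, hitting_time(jump_times, n)))


# Illustrative path (assumed data): jumps at times 0.4, 1.1 and 2.5.
jumps = [0.4, 1.1, 2.5]
for level in (1, 2, 4):
    print(level, hitting_time(jumps, level), stopped_path(jumps, level, 3.0))
```

Since T_n increases to infinity on every path with finitely many jumps in finite
time, each fixed t is eventually smaller than T_n, so the stopped paths recover N.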

5.2.6.  We  now  give  some  examples  to  illustrate  the  dependence  of  A  on  the 
filtration,  F.  We  need  to  recall  the  Jacod  [1975]  formula  discussed  in  the  section 
on  dual  previsible  projections. 

Assume that the filtration is the internal history, the σ-algebra generated by
the counting process, N.

5.2.7. Example (1): Except for some simple modifications, this example is given
in Liptser and Shiryayev [1978]. Suppose that X = (X(t), K(t), t ≥ 0) is an adapted
process with continuous paths and (K(t)) satisfies the “usual conditions”. Define
T(n) := inf{ t : X(t) ≥ 1 - (1/n) }, with T(n,ω) := ∞ if {...} is empty. Then we
know from Chapter 2 that each T(n) is a K-optional time. Define the counting
process, N = (N(t), K(t)), by setting N(t) := 1{T(∞) ≤ t}. Then, since (T(n))
increases to T(∞) and the sequence is optional, we see that T(∞) is a previsible
time. By definition then, N is also previsible. Hence, in the Doob-Meyer
decomposition, the previsibility of A (and so, the uniqueness of the decomposition)
implies that N = A. (Any process which is indistinguishable from zero is certainly
a martingale.)

Now, changing histories, let N be defined as before, except that N = (N(t), O(t)),
where O(t) is the sigma algebra generated by N. Let F be the distribution
function of T(∞) := T, and suppose that 1 - F(s-) > 0 on [0,∞).

Then, using the Jacod result (see the section on Dual Previsible Projections),

$$A(t) = \int_0^{t \wedge T(\infty)} \frac{dF(s)}{1 - F(s-)}.$$

Clearly, when F is continuous, A(t) = -ln( 1 - F(t∧T) ), t ≥ 0.

Thus, when A is K-previsible, A is the two valued counting process, N, but in the
second example, when A is O-previsible, 1 - exp(-A(t,ω)) = F(t∧T(ω)).
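
The closed form can be checked numerically. The sketch below is not part of the
source: it assumes, purely for illustration, that T is exponential with rate
theta (so F is continuous and the compensator should equal theta·(t∧T)), and it
approximates the Stieltjes integral by a left-endpoint sum. The function name
compensator_single_jump and the step count are hypothetical choices.

```python
import math


def compensator_single_jump(F, t, T, steps=20_000):
    """Approximate A(t) = integral over (0, min(t, T)] of dF(s) / (1 - F(s-)).

    Uses a left-endpoint Riemann-Stieltjes sum, which is adequate when the
    distribution function F is continuous (the assumption made here).
    """
    upper = min(t, T)
    if upper <= 0.0:
        return 0.0
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s0, s1 = i * h, (i + 1) * h
        total += (F(s1) - F(s0)) / (1.0 - F(s0))
    return total


theta = 2.0  # assumed rate of the exponential law of T


def F_exp(s):
    """Distribution function of an exponential(theta) random time."""
    return 1.0 - math.exp(-theta * s)


T_observed = 0.7  # one assumed realisation of the jump time T
for t in (0.3, 0.7, 1.5):
    numeric = compensator_single_jump(F_exp, t, T_observed)
    closed_form = -math.log(1.0 - F_exp(min(t, T_observed)))
    print(t, numeric, closed_form)
```

After the jump time the compensator stays frozen at its value at T, which is why
the integral is taken only up to t∧T.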


5.2.8. Example (2): When A is absolutely continuous relative to Lebesgue
measure and F(t) contains O(t), as defined in the last example, then,
with history (F(t)) we have

$$A(t) = \int_0^t r(s)\,ds,$$

and with history (O(t)) we have

$$A(t) = \int_0^t \hat{r}(s)\,ds,$$

where $\hat{r}(s) := E\{\, r(s) \mid O(s) \,\}$.

5.3. Examples of Semi-Martingales: In this section we will give several
examples of semi-martingales. The last of these examples will demonstrate a
procedure for writing a function of a point process as a semi-martingale.

5.3.1. Example (1): Let N be a counting process adapted to the filtration F. By
definition, N(t) is finite for every t ≥ 0. Assume that its previsible compensator
is absolutely continuous relative to Lebesgue measure, with Radon-Nikodym
density, λ, the F-intensity of N. (See the section on previsible projections for
these definitions.) Then,

$$N(t) = \int_0^t \lambda(s)\,ds + M(t), \qquad (*)$$

where M is an F-local martingale and λ is a non-negative, measurable process.

For example, when λ is a constant, then (N,P) is the Poisson process with
parameter λ. As the Poisson process is the baseline counting process, both
historically and usefully, it is important to note that property (*) characterizes
this process. That is, Watanabe [1964] proved that if N(t) - λt is a martingale,
then (N,P) is Poisson. P. Bremaud [1975] subsequently showed that if A is any
deterministic, right continuous increasing mapping of [0,∞) into itself, with
A(0) = 0, and N - A is a (P,N)-local martingale, then (N,P) is a generalized
Poisson process in the sense that the characteristic function of (N,P) is given by

$$E\big(\exp(iu(N(t)-N(s)))\big) = \prod_{s < v \le t} \big\{ e^{iu}\,\Delta A(v) + (1 - \Delta A(v)) \big\}\; \exp\big[(e^{iu} - 1)\,(A^c(t) - A^c(s))\big],$$


where A^c is the continuous part of A. Compare this and equation (*) with the
Doubly Stochastic Bernoulli process of Section 1.10.4.

5.3.2. Example (2): A sequence of sums of independently distributed random
variables, (Y_k), with finite expectation, can be used to construct a sequence of
semi-martingales. Let a_k = EY_k and for each n ≥ 1, define X_n(t) for t ≥ 0, by
setting


$$X_n(t) = \sum_{k=1}^{[nt]} (Y_k - a_k) + \sum_{k=1}^{[nt]} a_k = M(n,t) + B(n,t).$$

Then X_n = (X_n(t), t ≥ 0) is a sequence of semi-martingales.

It is worth noting that in this example the term, B, is purely deterministic and,
under very general conditions, for large n, M has the characteristic properties of
integrated noise.

5.3.3. Example (3): Let

$$X(t) := \int_0^t f(s)\,ds + M(t) = B(t) + M(t),$$

where (M(t),F(t)) is a Wiener Process (Doob, [1953]), and f(t) is an
F(t)-measurable process such that $E\big(\int_0^t |f(s)|\,ds\big) < \infty$ for
each t ≥ 0. This is the classical model for integrated signal plus noise.

In this case, X(t) is a Wiener process with drift process B(t) (drift rate f).
Further, if W denotes the standard Wiener process, and if M(t) is the Ito integral
of g relative to W (see the section on Stochastic Integration), with the Lebesgue
integral of g² having finite expectation, then X(t) is a Wiener process with drift
rate f and diffusion coefficient g.


5.3.4. Remark: The remaining examples in this section illustrate a technique for
writing functions of point processes as semi-martingales. This type of procedure
will be extremely useful in any application of the theory to nonlinear filtering of
point processes.


5.3.5. Example (4): Let (N,P) be a Poisson process with parameter c and define
the stochastic process X by setting X(t) := exp(qN(t)), for every t ≥ 0, where q
is a fixed positive number. Although it won’t play a distinctive role in this
example, we will let F = (F(t)) denote a history of the process N = (N(t)). This
example, like others, will be used again in this note as we illustrate the various
stages of the filtering problem, and notation will be carried forward.

Clearly,

$$X(t) = X(0) + \sum_{0 < s \le t} \Delta X(s).$$

Since X jumps at a point s only when N jumps at s, and then N(s) = N(s-) + 1,
we can write

$$\Delta X(s) = X(s-)\,(e^{q} - 1) = 2\,X(s-)\,e^{q/2}\sinh(q/2)$$

at jump points. So, since X(0) = 1, we have

$$X(t) = 1 + 2\,e^{q/2}\sinh(q/2) \sum_{0 < s \le t} X(s-)\,\Delta N(s).$$

Then we can write X in the form

$$X(t) = 1 + \int_0^t c\,\lambda(s)\,ds + M(t) = X(0) + B(t) + M(t),$$

where λ(s) := 2 exp(q/2) sinh(q/2) X(s-), and

$$M(t) := \int_0^t \lambda(s)\,d(N(s) - cs).$$

Since the compensated point process, N(t) - ct, is a martingale, and λ is
previsible, it follows from the theory of Lebesgue-Stieltjes stochastic integration
that M(t) is a martingale. Thus, as B is a process of integrable variation, X is a
semi-martingale.
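
The jump representation of X can be verified path by path. The following Python
sketch is an illustration rather than part of the source; the function name
x_from_jumps is hypothetical. It rebuilds X after a given number of jumps of N
from the sum of X(s-) over the jump times and compares the result with
exp(q·N(t)), using the identity e^q - 1 = 2 e^{q/2} sinh(q/2).

```python
import math


def x_from_jumps(q, n_jumps):
    """Rebuild X after n_jumps jumps of N from the jump formula

        X(t) = 1 + 2 e^{q/2} sinh(q/2) * (sum of X(s-) over jump times s <= t).
    """
    factor = 2.0 * math.exp(q / 2.0) * math.sinh(q / 2.0)  # equals e^q - 1
    x = 1.0            # X(0) = exp(q * 0)
    running_sum = 0.0  # sum of X(s-) over the jumps processed so far
    for _ in range(n_jumps):
        running_sum += x               # X(s-): value just before this jump
        x = 1.0 + factor * running_sum
    return x


q = 0.3  # assumed value of the fixed positive number q
for n in range(5):
    print(n, x_from_jumps(q, n), math.exp(q * n))
```

The identity does not depend on when the jumps occur, only on how many have
occurred, so no simulation of the Poisson jump times is needed for this check.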

5.3.6. Example (5): (Bremaud [1977], [1981]) Consider a queue with the number
of messages (customers) arriving during the interval [0,t] denoted by a(t) and the
number of departures during this time period denoted by d(t). Let q(t) be the
number of messages waiting for service (processing) or being served at time t.
Assume that q(0) is a positive random variable. Set q(t) = q(0) + a(t) - d(t) for
all t ≥ 0. Assume that Δa(t)Δd(t) = 0 for all t ≥ 0 (i.e., a and d have no jumps
in common). By definition q(t) ≥ 0 for all t ≥ 0. Let z(t,ω,n) = 1(q(t,ω)=n);
we will use this example to determine the conditional distribution of q given
observations on the number of arrivals. This is because

$$E(\, z(t) \mid F(t) \,) = P(\, q(t) = n \mid F(t) \,).$$

As in the previous example, we begin by writing

$$z(t) = z(0) + \sum_{0 < s \le t} \Delta z(s) = z(0) + \sum_{0 < s \le t} \Delta z(s)\,\Delta a(s) + \sum_{0 < s \le t} \Delta z(s)\,\Delta d(s).$$

Fix n ≥ 1; then if s is a point of increase of a (that is, if Δa(s) = 1), then
q(s) = q(s-) + 1. Hence,

$$z(s,n) := 1(q(s) = n) = 1(q(s-) + 1 = n) = z(s-,n-1),$$

so that Δz(s,n) = z(s-,n-1) 1(n≥1) - z(s-,n).

Similarly, if Δd = 1, then Δz(s,n) = z(s-,n+1) - z(s-,n).

Assuming that the counting processes a and d have F-intensities, t → l(t,ω) and
t → u(t,ω), we can accumulate the previous equations to write

$$z(t,n) - z(0,n) = \int_0^t \Delta z(s,n)\,\big(da(s) + dd(s)\big).$$

Then, as in the last example, by adding and subtracting Lebesgue integrals of the
intensities, we obtain

$$z(t,n) - z(0,n) = \int_0^t \Delta z(s,n)\,\big(l(s)\,ds + u(s)\,ds + dm(s) + dv(s)\big)$$

$$= \int_0^t \big(z(s,n-1)\,1(n \ge 1) - z(s,n)\big)\,l(s)\,ds + \int_0^t \big(z(s,n+1) - z(s,n)\,1(n > 0)\big)\,u(s)\,ds + M(t) + V(t).$$


Thus, using the linearity of the Lebesgue integral and the fact that the sum of
two martingales, in this case M and V, is again a martingale, we have

$$z(t,n) - z(0,n) = \int_0^t f(s)\,ds + m(t) = B(t) + m(t),$$

as the semi-martingale representation of z, where

$$f(s) = \big(z(s,n-1)\,1(n \ge 1) - z(s,n)\big)\,l(s) + \big(z(s,n+1) - z(s,n)\,1(n > 0)\big)\,u(s),$$

$$m(t) = M(t) + V(t) = \int_0^t \big(z(s-,n-1)\,1(n \ge 1) - z(s-,n)\big)\,\big(da(s) - l(s)\,ds\big) + \int_0^t \big(z(s-,n+1) - z(s-,n)\,1(n > 0)\big)\,\big(dd(s) - u(s)\,ds\big).$$
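
The bookkeeping behind this representation is the path-wise jump identity for
the indicator z(t,n). The Python sketch below is not from the source; it assumes
one sample path of the queue encoded as a time-ordered list of events (+1 for an
arrival, -1 for a departure), and the name indicator_path is hypothetical. It
accumulates the jumps Δz(s,n) given above and compares the result with the
indicator computed directly from q.

```python
def indicator_path(q0, events, n):
    """Track z(t, n) = 1(q(t) = n) along one path and rebuild it from its jumps.

    events: list of +1 (arrival) and -1 (departure) in time order.
    Returns (z_direct, z_rebuilt) after all events have been processed.
    """
    q = q0
    z_rebuilt = 1 if q0 == n else 0  # z(0, n)
    for e in events:
        z_prev_n = 1 if q == n else 0  # z(s-, n)
        if e == 1:
            # Arrival: dz = z(s-, n-1) 1(n >= 1) - z(s-, n)
            z_prev_other = 1 if (n >= 1 and q == n - 1) else 0
        elif e == -1:
            if q == 0:
                raise ValueError("departure from an empty queue")
            # Departure: dz = z(s-, n+1) - z(s-, n)
            z_prev_other = 1 if q == n + 1 else 0
        else:
            raise ValueError("events must be +1 or -1")
        z_rebuilt += z_prev_other - z_prev_n
        q += e
    z_direct = 1 if q == n else 0
    return z_direct, z_rebuilt


# Assumed sample path: start with one customer, then a few arrivals/departures.
path = [1, 1, -1, 1, -1, -1]
for level in range(4):
    print(level, indicator_path(1, path, level))
```

Replacing the increments da and dd by their intensities plus compensated
martingale increments is then what produces the drift f and the martingale m.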

Chapter 6. Stochastic Integrals


6.1. Introduction: N. Wiener [1923] defined a stochastic integral with
Brownian motion integrators and deterministic integrands. K. Ito [1944, 1951]
developed a stochastic integral for a class of processes which are optional
(non-anticipating) relative to Brownian motion.

In his construction, aside from the properties of any continuous martingale, Ito
only used two properties of Brownian motion. Namely, that

(W(t), t ≥ 0)    (1)

and

(W²(t) - t, t ≥ 0)    (2)

are martingales, where W is standard Brownian motion.

6.1.1. J. Doob [1953] extended stochastic integration of Ito to the class of square
integrable martingales. In their important paper, “On Square Integrable
Martingales”, Kunita and Watanabe used the following result analogous to equation
(2) for square integrable martingales: Since m is a square integrable martingale,
m² is a submartingale, so by the Doob-Meyer Theorem, there is an increasing
process, denoted <m,m>, such that

(m²(t) - <m,m>(t), t ≥ 0)    (3)

is a martingale. We will formally introduce <m,m> below, but equation (3) has
already occurred in Chapter 1 for discrete processes so the reader should not have
difficulty with it. For continuous processes, we will only point out that in the
case of Brownian motion, <m,m>(t) = t, in which case equations (2) and (3) agree
and, also, that the Kunita-Watanabe stochastic integral reduces to the Ito
integral when the martingale integrator is Brownian motion.

Stochastic  calculus  is  still  young  enough,  in  terms  of  the  length  of  time  it  takes 
for  significant  mathematical  theories  to  develop,  that  it  is  almost  always 
presented  as  it  was  developed  historically.  We  will  call  this  the  traditional 
approach.  Dellacherie's  1978  talk  at  Helsinki  is  an  exception,  and  in  some  ways 
Jacod [1979] is also. We will follow Jacod.

In the traditional approach the stochastic integral is developed as outlined in the
next Section. As described there, it is defined first for a particular class of
integrands consisting of linear combinations of (previsible) indicator processes
relative to a square integrable martingale, m. The actual definition of this
“elementary stochastic integral” is given in terms of the transforms of Chapter 1.
The first extension, to the space of square integrable martingales, is accomplished
using the path-wise stochastic Lebesgue-Stieltjes integral relative to the
increasing, previsible process, <m,m>. However, as we have pointed out earlier,
this process does not exist when the underlying martingale does not have moments
of the second order. To remedy this situation a second increasing process is
created that does not require the existence of second order moments. It is the
continuous parameter analogue of the optional quadratic variation process of
Chapter 1 and is also denoted by [m,m]. As in Chapter 1, if the process m has
second order moments, so that <m,m> exists, [m,m] - <m,m> is a martingale.
Hence, when m is not in L², the < , > process is defined as the dual previsible
projection of the process [m,m]. Formally then, the development of the stochastic
integral for the larger class of integrators proceeds as before in terms of a
Lebesgue-Stieltjes integral relative to [m,m].

This then becomes the procedure for extending the integral to an ever widening
circle of families of processes, culminating with its final extension to
semi-martingale integrators and locally bounded previsible integrands. At each
step, preparation for the next extension is made by first extending the increasing
process, [m,m], and then repeating the definition of the next more general
stochastic integral in terms of, notationally, the same defining equation as
utilized at the previous step.

6.1.2. Embedded in the procedure just sketched is a method of extending the
(integrator) processes themselves. We have already encountered one example in
going from martingales to local martingales. This is the method of localization.
It is one of the most important applications of stopping times in the theory
of martingales. It goes as follows.

Let (Ω, H, (F(t), t ∈ R+), P) be a filtered probability space satisfying the
“usual conditions”. Let C be a family of processes (equivalence classes of
indistinguishable processes) defined on this probability space. Denote by
C_loc = C_loc(F,P) the family of processes, X, defined on the same probability
space for which there exists an increasing sequence, (T_n, n ∈ Z+), of stopping
times, T_n ↑ ∞ a.s.P, such that each stopped process X^{T_n} ∈ C. For example,
letting M_u be the set of uniformly integrable martingales, its localization is
(M_u)_loc, the family of local martingales. We have seen that M_u ⊂ (M_u)_loc in
Chapter 5.

C_loc is called a localized class. Jacod [1979] proves a number of interesting
results on the algebra of localized classes. For instance, he shows that if a class,
C, a vector space of processes, is closed under the operation of stopping (called
stable under optional stopping), then (C_loc)_loc = C_loc. The reader should
ponder this result in relation to the family M_u.

On our way to extending stochastic integrals, we will apply localization to a
number of classes of processes. This will be carried out a little differently for
integrands than for integrators, for obvious reasons. In any case, the class of
bounded, previsible processes becomes the class of locally bounded, previsible
processes, and the class of processes of integrable variation becomes the class of
processes of locally integrable variation. We will prove that the class, S, of
semi-martingales cannot be extended by localization: S = S_loc (Jacod [1979]).

The stochastic integral will not be extended beyond the class S of integrators.
The reason for this is simple. It cannot be done. That is, it can’t be done if we
want sequences of stochastic integrals to have the following natural Cauchy
property:

Let (h^{(n)}) be a uniformly bounded sequence of previsible
processes. Then the point-wise convergence of this sequence
to 0 as n → ∞ implies that

$$\int_0^t h_s^{(n)}\,dX_s \to 0 \quad \text{in probability, as } n \to \infty,$$

for all t, where X is in S.

Bichtler [1981] proved that if an integrator is Skorokhod and adapted, and the
corresponding stochastic integral possesses this Cauchy property, then the
integrator is necessarily a semi-martingale.

The  material  in  this  chapter  is  based  primarily  on  Jacod  [1979],  Kunita  and 
Watanabe  [1967],  Doleans-Dade  and  Meyer  [1970],  Meyer  [1976],  Rogers  [1981], 
and  Dellacherie  and  Meyer  [1978].  Dellacherie  [1978],  Chung  and  Williams  [1983], 
and  Ikeda  and  Watanabe  [1981]  were  also  used. 

6.2. An Outline of the Construction of Stochastic Integrals:

6.2.1. Introduction: In this Section, we will attempt to outline the traditional
approach to constructing the stochastic integral. In succeeding Sections, we will
mostly follow the development of Jacod [1979]. Although Jacod’s development
does not begin with elementary processes and simple integrals, there is much in
common with the outline given here. The principal reason for following Jacod is
that it leans more heavily on martingale methods (the Strasbourg variety) than
on the methods of classical functional analysis. It therefore appears to be more
succinct and self-contained than the traditional approach.

What we are referring to as the traditional approach begins the way most of us
would expect. However, as observed by Rogers [1981], some very clever new
ideas were required to successfully carry out the original development of the
stochastic integral. As noted earlier, this was done by Kiyosi Ito in the 1950's for
Brownian Motion and extended to square integrable martingales in the 1960's by
Kunita and Watanabe. P-A. Meyer and the Strasbourg School of probabilists are
mostly responsible for the final extension to semi-martingales in the late 1960s
and the 1970s.


6.2.2. Outline: The first integral to be introduced in this Section is called the
elementary stochastic integral. In French literature, it is called the triviale
stochastic integral, translated as the “obvious stochastic integral”. As
demonstrated in an example by Rogers, aside from starting with processes whose
sample paths are simple step functions and defining their integral as a finite sum,
the definition of the elementary stochastic integral is neither trivial nor obvious.


Let (Ω, H, (F(t)), P) be a filtered probability space with
$F(\infty) = \sigma\big(\bigcup_{s \ge 0} F(s)\big)$ contained in H and assume the
“usual conditions”.


6.2.3. Let the family E designate the vector space of linear combinations of
indicator functions of rectangular subsets of (s,t]×Ω of the form (s,t]×A, with
A ∈ F(s), and s < t, s,t in R+; in other words, E consists of linear combinations
of the kernel processes which generate the F-previsible σ-algebra. More precisely,
let E be the family of processes H = (H(t), t ≥ 0) such that H is adapted, left
continuous, bounded and for which there exists a finite set
{t_k : k = 0,1,2,...,n,n+1} which partitions [0,∞],

$$t_0 = 0 < t_1 < \cdots < t_n < t_{n+1} = \infty,$$

and such that t → H(t,ω) is constant on each subinterval of the partition.
Further, assume that each r.v. H_i = H(t_i) is F(t_i)-measurable,
i = 0,1,...,n,n+1.


6.2.4. Definition: Let M be a bounded martingale and H ∈ E. Then the
elementary stochastic integral, H.M, is

$$(H.M)(t) := \sum_{k \ge 0} H_k\,\Delta M_k^t \qquad (4)$$

$$:= H(0)M(0) + \sum_{k \ge 1} H(t_k)\,\big(M(t_{k+1} \wedge t) - M(t_k \wedge t)\big).$$

6.2.5. Remark: Based on Chapter 1, this is nothing more than the martingale
transform of an integrable, stopped, bounded martingale M by the bounded
previsible process H, hence we have immediately that H.M is a martingale.
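
For a single sample path the elementary stochastic integral is just a finite sum,
as the following Python sketch illustrates. It is not part of the source: the
function name elementary_integral, the list-based encoding of the partition, and
the indexing convention (h_values[k] is the constant value of H on the interval
(t_k, t_{k+1}], with the last interval unbounded) are assumptions made for this
example. In the random setting each h_values[k] would have to be
F(t_k)-measurable; here they are plain numbers for one fixed ω.

```python
def elementary_integral(times, h_values, M, t, h0=0.0):
    """Path-wise elementary stochastic integral (H.M)(t).

    times:    partition points t_0 = 0 < t_1 < ... < t_n (finite ones only)
    h_values: h_values[k] is the value of H on (t_k, t_{k+1}]; the last
              entry applies on (t_n, infinity)
    M:        callable giving the path s -> M(s)
    h0:       the value H(0), multiplying M(0)
    """
    total = h0 * M(0.0)
    for k, hk in enumerate(h_values):
        left = min(times[k], t)
        # The last interval has no finite right end point, so it is cut at t.
        right = min(times[k + 1], t) if k + 1 < len(times) else t
        total += hk * (M(right) - M(left))
    return total


def sample_path(s):
    """An assumed deterministic path, used only to exercise the formula."""
    return s * s


print(elementary_integral([0.0, 1.0, 2.0], [2.0, -1.0, 3.0], sample_path, 2.5))
print(elementary_integral([0.0, 1.0, 2.0], [2.0, -1.0, 3.0], sample_path, 0.5))
```

Stopping the increments at t (the minimum with t in each term) is what makes the
same formula valid for every t, including t beyond the last partition point.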

To ease the notational burden for the reader, we will often write

$$\int_0^t H(s)\,dM(s) := (H.M)(t).$$

Notice that with the definition as in (4), there should be no ambiguity in
meaning if we set the upper limit in the last expression equal to the symbol ∞.

In general, here and in the sequel, when the notation H(s) becomes too
cumbersome because of superscripts and such we will write H_s for H(s). Though
perhaps ambiguous here, this should not be the case in actual usage.

Now, since M is a square integrable martingale, M² = (M_t², t ≥ 0) is an
F-submartingale and so the Doob-Meyer decomposition theorem of Chapter 5
guarantees the existence of an increasing, previsible process, A, with the property
that M² - A is an F-martingale. (We have already used the notation
<M,M> = A.)

Then it is easy to see that for a simple process H,

$$E\big((H.M)_\infty^2\big) = E\Big(\sum_{k \ge 0} H_k^2\,(\Delta M_k)^2\Big)$$

$$= E\Big(\sum_{k \ge 0} H_k^2\,E\big((\Delta M_k)^2 \mid F_{k-1}\big)\Big)$$

$$= E\big((H^2.\langle M,M\rangle)_\infty\big) = E\int_0^\infty H_s^2\,d\langle M,M\rangle_s,$$

where we have used the previsibility of H and the martingale property of M to
obtain

$$E(H_j H_k\,\Delta M_j\,\Delta M_k) = E\big(H_j H_k\,\Delta M_j\,E(\Delta M_k \mid F_{k-1})\big) = 0$$

for 0 ≤ j < k, and so the first equation. To go from the first equation to the
second use either the definition of <M,M> from Chapter 1, or a simple
relationship between M and <M,M> (in fact, the reason for the name “quadratic
variation”) that will be derived later in this Chapter.
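
The isometry can be checked exactly in a small discrete model. The Python sketch
below is an illustration and not the construction used in the source: it assumes
M is a symmetric ±1 random walk (so that <M,M>_k = k) and takes a previsible
integrand H_k that depends only on the steps before time k. The names
isometry_sides and h_rule are hypothetical. Both sides of the identity are
computed by enumerating all paths.

```python
from itertools import product


def isometry_sides(n_steps, h_rule):
    """Exact check of E[(sum_k H_k dM_k)^2] = E[sum_k H_k^2 d<M,M>_k].

    M is a symmetric random walk with steps +1/-1 (probability 1/2 each),
    so d<M,M>_k = 1. h_rule(past_steps) returns H_k and sees only the steps
    strictly before time k, which makes H previsible.
    """
    lhs = 0.0
    rhs = 0.0
    prob = 0.5 ** n_steps  # every path has the same probability
    for steps in product((-1, 1), repeat=n_steps):
        integral = 0.0
        energy = 0.0
        for k in range(n_steps):
            hk = h_rule(steps[:k])
            integral += hk * steps[k]  # H_k times the increment of M
            energy += hk * hk          # H_k^2 times d<M,M>_k = 1
        lhs += prob * integral ** 2
        rhs += prob * energy
    return lhs, rhs


def h_rule(past_steps):
    """An assumed previsible integrand: 1 + |current position of the walk|."""
    return 1.0 + abs(sum(past_steps))


print(isometry_sides(6, h_rule))
```

The cross terms vanish for exactly the reason given above: H_j, H_k and the
earlier increment are known before step k, and the conditional mean of the k-th
increment is zero.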

Since the F-previsible σ-algebra is generated by the kernel processes, any
previsible process is the limit of a sequence of simple processes. Let

$$L^2(M) := \Big\{ H : H \text{ previsible},\; E\int_0^\infty H_s^2\,d\langle M,M\rangle_s < \infty \Big\},$$

where the last integral is a Stochastic Lebesgue-Stieltjes integral (Chapter 3)
relative to the nondecreasing process <M,M>, the dual previsible projection of
[M,M].

Let m be a square integrable martingale and set

$$\|H\|_{L^2(m)} := \Big(E\int_0^\infty H^2\,d\langle m,m\rangle\Big)^{1/2}.$$

Then ||H||_{L²(m)} is a norm on L²(m). Notice that since H is previsible, L²(m)
is independent of the choice of martingale compensator of m. (Just recall the
results in Chapter 4 on dual previsible projection.)

Define K² to be the space of square integrable martingales, m
(sup_{s ∈ R+} E m_s² < ∞). The phrase “square integrable” is a result of the fact
that, as a submartingale, m² has an increasing mean-value, so that when the
supremum is taken over compact intervals, [0,T], rather than over R+, “square
integrability” indeed just means E m_T² < ∞, or existence of second order
moments.

From Chapter 2, we know that if m is square integrable, then it has a terminal
r.v., m_∞. Let ||m||_{K²} := (E m_∞²)^{1/2} be the norm on K².

The following Theorem is then proved in almost every presentation of the
stochastic integral. It establishes an isometry between L²(m) and K².

6.2.6. Theorem:

The mapping H → H.m, from E to bounded martingales, can be extended uniquely
as a norm preserving operator from L²(m) into K², and will continue to be
denoted by H → H.m.


6.2.7. Next, it may be verified (as in Chapter 1) that

o X^T = 1_{[0,T]}.X, for all optional T.
o Δ(H.X)_t = H_t ΔX_t, a.s.P, t ≥ 0.

Seemingly, in all integration theories the difficult work begins with K², the space
of square integrable martingales. Much more will be said about this space in the
next Sections. Kunita and Watanabe [1967] give fundamental results on the
decomposition of the space K². To prepare for this we need to know that a
“stable” subspace, Q, of K², is essentially just a closed subspace of K² that is
closed under stopping. We need the following

6.2.8. Definition: Processes m, n ∈ K² are said to be orthogonal if the process
mn = (m_t n_t, t ≥ 0) is a martingale.

Remark: It will be shown later in this Chapter that if m and n are square
integrable and vanish at the origin, m_0 = 0 = n_0, then m and n are orthogonal if
E m_T n_T = 0, for every stopping time T.

Kunita and Watanabe [1967] prove: If Q is a stable subspace of K², then every
martingale, m, in K² can be uniquely decomposed into a sum, m = x + y,
where x belongs to Q and y is orthogonal to every element of Q.

If one recalls (Chapter 2) that the norm in K² is equivalent to the L²(P) norm of
the supremum process, m_t^* = sup_{s ≤ t} |m_s|, it is easy to show that the space
of continuous, square integrable martingales is stable. If we call this space Q,
and observe the convention that m_{0-} = 0, we have Q ⊂ K_0², the latter being the
set of square integrable martingales that vanish at the origin.

Applying the decomposition theorem of Kunita and Watanabe, this Q yields a
unique decomposition of any square integrable martingale, m, into a continuous,
square integrable martingale, m^c, and a “purely discontinuous” square integrable
martingale, m^d, which is orthogonal to every element of Q.

The space of purely discontinuous martingales will be described in a later
Section. For now it is sufficient to know that this space is the closure in K² of a
relatively simple class of martingales whose paths are of bounded variation, a.s.P.
But not every purely discontinuous martingale is of bounded variation. In
contrast to this, every nonconstant, continuous (nonzero, by the convention
m_{0-} = 0) martingale has (a.s.P) paths of unbounded variation. This follows
easily from the Doob-Meyer decomposition theorem by assuming that such a
(necessarily previsible) martingale is of bounded variation.

For  these  reasons  the  construction  of  a  stochastic  integral,  even  within  the  space 
of  square  integrable  martingales,  is  a  formidable  affair. 

Kunita and Watanabe also define a process, <m,n>, for square integrable
martingales (recall Chapter 1 for the discrete analogue), as the unique, previsible
process with the property that mn - <m,n> is a martingale. Note that <m,n> =
0 then becomes a sufficient condition for orthogonality of m and n. This new
process, which is of bounded variation, is used by Kunita and Watanabe to
characterize the process, H.m, as opposed to the operator H → H.m. But it is clear
that existence problems arise when m and n are not square integrable. To cope
with this difficulty, Meyer introduced a process denoted by [m,n] which exists
even when m and n are not square integrable and, like the process <m,n>, is a
process of bounded variation.

Finally, Kunita and Watanabe created a type of Schwarz inequality in terms of
the process <m,n> (given later in this chapter in terms of the process [m,n])
and used Stochastic Lebesgue-Stieltjes integrals (introduced in Chapter 3) with
previsible integrands to establish the following characterization of the stochastic
process, H.m:

6.2.9. Theorem:

If m, n ∈ K² and H ∈ L²(m), then

$$E\Big(\int_0^\infty |H_s|\,|d\langle m,n\rangle_s|\Big) < \infty$$

and the stochastic integral, H.m, is the unique element of K² (up to
indistinguishability) which satisfies the equation

$$[H.m, n] = H.[m,n]$$

for every n in K².

The rest of the development, as noted in the introduction to this Chapter,
consists of a succession of extensions of the quadratic variation processes and of
the stochastic integral which culminate in the stochastic integral of locally
bounded, previsible processes relative to local martingales and thence to
semi-martingale integrators.

The Jacod development starts with the definition of the stochastic integral of a
local martingale, albeit a continuous one. The attractive feature of his approach
is that it focuses on semi-martingales from the beginning. With some minor
exceptions (occurring with the treatment of purely discontinuous processes and in
the preparation for the definition of quadratic variation) Jacod’s approach is
followed in the remainder of this chapter.

6.3. Some Extensions to Chapters 3-5: In this Section we will bring together
and extend some of the material in Chapters 3, 4 and 5. As usual, let
$(\Omega, F(\infty), (F(t), t \ge 0), P)$, where $F(\infty) = \sigma(\bigcup_{t \ge 0} F(t))$,
be the underlying filtered probability space. Recall (Section 3.2) the definitions of
increasing processes and the notation for the classes of increasing processes, $V^+$,
of processes of bounded (finite) variation, $BV = V^+ - V^+$, and of processes of
integrable variation, $IV = IV^+ - IV^+$. (Of course, we mean $V^+ = V^+(F,P)$,
$BV = BV(F,P)$, and so on.) Let C be a class of processes. We write $C_0$ for the
set of all $A \in C$ with A(0) = 0.

If $A \in BV$, then $B(t) = \int_{[0,t]} |dA(s)|$ denotes the variation of the process A.
It is the unique (to indistinguishability) process of $V^+$ such that the measure
$(0,t] \to dB(t,w)$ on $R_+$ is the total variation of the signed measure
$(0,t] \to dA(t,w)$.

6.3.1. Using the notation introduced in Section 6.1.2, an increasing process, A
($A \in V^+$), with A(0) = 0 is said to be locally integrable if $A \in (IV_0^+)_{loc}$. That is,
if there exists an increasing sequence, $T_n \uparrow \infty$, a.s.P, of stopping times such that
$E A^{T_n}(\infty) < \infty$. Since $A(t \wedge T_n) \le A(T_n)$ we can and will write the condition as
$E A(T_n) < \infty$. If $A(0) \ne 0$, then the definition applies to the process
$t \to A(t) - A(0)$ and $E(A(0) | F(0)) < \infty$ is required. In this case we write
$A \in (IV^+)_{loc}$.

A is said to be of local integrable variation if $A \in BV$ and the process
$t \to \int_{[0,t]} |dA(s)| := B(t)$ is locally integrable. In this case we write
$A \in (IV)_{loc}$.

More succinctly, $A \in (IV)_{loc}$ iff $B \in (IV^+)_{loc}$.

When the local integrability of the variation process, B, is at issue we can use the
fact that if $EB(T) < \infty$ and $EB(S) < \infty$ for two stopping times, then
$EB(\max(S,T)) < \infty$. Therefore, for local integrable variation, we don't have to
require that $T_n \uparrow \infty$; $\sup_n T_n = \infty$ is good enough.


6.3.2. The first two results in this Section concern local variation. Simply stated,
increasing previsible processes are locally integrable, and optional processes of BV
are of local integrable variation iff they differ from local martingales by a
previsible process of bounded variation. The proofs may be found in Dellacherie and
Meyer [1982, VI 80] or Jacod [1979, p. 17].

6.3.3. Theorem:

Let A be a process of bounded variation.

(1) If A is previsible then A is of local integrable variation.

(2) A is of local integrable variation iff there exists a
previsible process $B \in BV$ such that A - B is a local
martingale.

B is unique, modulo indistinguishability, and is called the
dual previsible projection, or the previsible compensator,
of A.

This  extends  the  Chapter  4  notion  of  the  dual  previsible  projection  of  increasing, 
integrable  processes  to  bounded  variation  processes  of  local  integrable  variation. 

6.3.4. Remark: We will sketch the proof of the first part of this Theorem with the
aim of giving the reader a feeling for the use and force of these new definitions.
In part (1), since A is of bounded variation and previsible, the total variation
process of A is previsible. Therefore, without loss of generality, take A to be
increasing and A(0) = 0. Set $S_n = \inf( t : A(t) \in [n,\infty) )$, where $1 \le n \in Z_+$. Then
by the results concerning the debut of random sets in Chapter 2, $S_n$ is a
stopping time. It is strictly positive since A(0) = 0, and it can be shown to
be previsible since A is previsible. Therefore, for each n, $S_n$ has an announcing
sequence, $(S_{n,k}, k \in Z_+)$, and since $S_{n,k} < S_n$ for all k (since $S_n$ is strictly positive),
we have by the definition of $S_n$ that $A(S_{n,k}) < n$ for all k. Therefore,
$E A(S_{n,k}) < \infty$. Since $\sup\{S_{n,k} : n, k\} = \infty$, A is locally integrable.

The following Corollary is useful and obvious. It is a generalization of the result
in Chapter 4 which said that $IV_0$ martingales have evanescent dual previsible
projections.




6.3.5. Corollary:

If A is of bounded variation, then A is a local martingale iff A is of local
integrable variation and the previsible compensator of A is evanescent.

6.4. Some Spaces of Martingales: Let $(\Omega, F(\infty), F, P)$, where
$F(\infty) = \sigma(\bigcup_{t \ge 0} F(t))$ and $F = (F(t), t \in R_+)$, be a filtered probability space.
Let $M_u = M_u(F,P)$ be the space of uniformly integrable F-martingales. Of course $M_u$
is a set of equivalence classes of processes under indistinguishability.

As noted in Chapter 2, if X belongs to $M_u$, then (X(t)) converges a.s.P, and in $L^1$,
as $t \to \infty$, to a terminal random variable $X(\infty)$ and we can write
$X(t) = E( X(\infty) | F(t) )$. The converse was also discussed, so we know that if
$Z \in L^1(\Omega, H, P)$ there exists a unique X in $M_u$ such that $X(t) = E(Z | F(t))$ and
$X(\infty) = E(Z | F(\infty))$. It follows that $M_u$ is mapped bijectively onto
$L^1(\Omega, F(\infty), P)$.

6.4.1. Definition: A (right continuous) supermartingale, X = (X(t), F(t)), is
said to belong to the class D iff the family of random variables,
{X(T) : T any finite F-stopping time}, is uniformly integrable.

6.4.2. We now characterize $M_u$ as a subset of $(M_u)_{loc}$. For convenience we will
write $M_{loc} := (M_u)_{loc}$ throughout this chapter.

6.4.3. Lemma: Let $X \in M_{loc}$. Then $X \in M_u$ iff X belongs to the class D.

Remark: We will indicate the proof. Let X belong to $M_u$. Then by our previous
remarks, there exists $X(\infty) \in L^1$ such that $X(T) = E(X(\infty) | F(T))$, for each
optional T. As noted in Chapter 2 (Doob's Optional Stopping Theorem), if we
let T range over the set of finite (i.e. real) valued optional times, then the family
{X(T)} is uniformly integrable and X belongs to the class D.

Conversely, let X be a local martingale in the class D. Then, in particular the
family $\{X(t), t \in R_+\}$ is uniformly integrable. Thus, it remains to show that
$(X(t), t \in R_+)$ is a martingale. Let $(T_n)$ be a localizing sequence for the local
martingale X. Then for s and t real numbers, s < t, the families $\{X(t \wedge T_n), n \in Z_+\}$
and $\{X(s \wedge T_n), n \in Z_+\}$ are uniformly integrable and the corresponding
sequences (s and t are fixed) converge a.s.P and in $L^1$ to the random variables
X(t) and X(s), respectively. By definition of X as a local martingale,
$X^{T_n}(s) = E(X^{T_n}(t) | F(s))$ for each n. It follows (using Jensen's inequality) that

$$|E( X^{T_n}(t) - X(t) \,|\, F(s) )| \le E( |X^{T_n}(t) - X(t)| \,|\, F(s) ).$$

Taking expectations of both sides, we obtain

$$E\big( |E(X^{T_n}(t) | F(s)) - E(X(t) | F(s))| \big) \le E|X^{T_n}(t) - X(t)|.$$

Having noted that $(X^{T_n}(t), n \ge 1)$ converges to X(t) in $L^1$,
and similarly $(X^{T_n}(s), n \ge 1)$ to X(s), we have $X(s) = E(X(t) | F(s))$, hence $X \in M_u$.

6.4.4. Remarks: Now let $m^*(t) = \sup_{s \le t} |m(s)|$ for any process
$m = (m(s), s \in R_+)$. Let $p \in [1,\infty]$ and $\|Y\|_p^p = E(|Y|^p)$, the
$L^p = L^p(\Omega, F, P)$ norm of Y of order p. Recall that if $p = \infty$, $L^\infty$ denotes the
family of F-measurable, bounded functions. Set

$$K^p = \{ m \in M_{loc} : \|m^*(\infty)\|_p < \infty \}.$$

Then $K^p \subset M_u$, for $p \ge 1$. This is because $K^p \subset K^{p'}$ for all $p' \le p$, by
Hölder's inequality, and so in particular $K^p \subset K^1$. But if $m \in K^1$, then $m^*(\infty)$ is
P-integrable and so m is in the class D. Therefore, by the last Lemma, $m \in M_u$.
That is, $K^1 \subset M_u$.
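The Hölder step can be spelled out as follows. This is only a sketch: the auxiliary exponents r and r' are notation introduced here, and the case $p = \infty$ (where $m^*(\infty)$ is bounded) is immediate and omitted.

```latex
% Assumption: 1 \le p' < p < \infty, r := p/p' > 1, and r' is the conjugate exponent of r.
\begin{aligned}
E|Y|^{p'} &= E\big(|Y|^{p'} \cdot 1\big)
  \le \big(E|Y|^{p' r}\big)^{1/r} \big(E\,1^{r'}\big)^{1/r'}
  = \big(E|Y|^{p}\big)^{p'/p}, \\
\text{hence}\quad \|Y\|_{p'} &\le \|Y\|_{p}, \qquad
\text{and with } Y = m^*(\infty):\quad
\|m^*(\infty)\|_{p} < \infty \;\Rightarrow\; \|m^*(\infty)\|_{p'} < \infty .
\end{aligned}
```

The inequality uses P(Ω) = 1; on a general measure space the inclusion of the spaces can fail.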

6.4.5. As noted in Chapter 2, if p > 1 then the norms $\|m(\infty)\|_p$ and $\|m^*(\infty)\|_p$ are
equivalent. Therefore if $p \in (1,\infty]$, then $K^p$ can be equipped with the norm defined
by the mapping $m \to \|m(\infty)\|_p$. In this manner, $K^p$ is identified with the space
$L^p(\Omega, F(\infty), P)$ through the bijection $(m \in K^p) \leftrightarrow m(\infty) \in L^p$, for p > 1. (Recall
also that there exists a bijection between $M_u$ and $L^1$.)
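For reference, the equivalence of the two norms is usually obtained from Doob's $L^p$ inequality. The display below is a reminder of its standard form for $1 < p < \infty$, not a quotation from Chapter 2; the symbol q for the conjugate exponent is introduced here.

```latex
% Doob's L^p inequality for m in K^p, 1 < p < \infty, with 1/p + 1/q = 1.
\|m(\infty)\|_p \;\le\; \|m^*(\infty)\|_p \;\le\; q\,\|m(\infty)\|_p,
\qquad q = \frac{p}{p-1}.
% Example: p = 2 gives E\big(m^*(\infty)^2\big) \le 4\,E\big(m(\infty)^2\big).
```

The left inequality holds because $|m(\infty)| \le m^*(\infty)$ almost surely. The constant q blows up as p decreases to 1, which is consistent with the case p = 1 being treated separately in the text.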

The space $K^2$ is called the space of square integrable martingales or the
space of $L^2$-bounded martingales. (These were also defined in Chapter 2.) By
remarks in the previous paragraph, the space $K^2$ is identified with the Hilbert
space $L^2$, having norm $m \to \|m(\infty)\|_2$ and scalar product $(m,n) \to E(m(\infty) n(\infty))$.

Set $M_0$ equal to the space of martingales such that $m \in M_u$ and m(0) = 0. We will
write $(M_0)_{loc}$ as $M_{0,loc}$.

6.4.6. Following Jacod [1979] we state the following.

Definition:  Let  m  and  n  be  local  martingales.  Then  m  and  n  are  said  to  be 
(strongly)  orthogonal  if  the  product  mn  is  a  local  martingale  which  vanishes  at 
the  origin. 

Remark: The traditional source, Meyer [1975], first defines orthogonality for
square integrable martingales; he then extends this to local martingales as above.
For square integrable martingales he defines m to be orthogonal to n if
m(0) = 0 and Em(T)n(T) = 0 for all stopping times, T. He then proves that
$m, n \in K^2$ are orthogonal iff the product $mn \in K^1$ and m(0) = 0, which is Jacod's
definition restricted to the space $K^2$. To show this characterization of
orthogonality, one needs the following interesting characterization of $M_u$. One
should refer back to the proof of Doob's Optional Sampling Theorem (in Chapter 1)
for the genesis of this theorem. Actually, the fact that the equality of expectations
of an integrable process at different finite stopping times is equivalent to the
Chapter 1 form of Doob's Theorem for martingales is sometimes called Komatsu's
Lemma. This Lemma will also be used in the proof of the Theorem characterizing
stochastic integrals relative to a continuous local martingale.

6.4.7. Lemma:

Let L be an adapted Skorokhod process for which $\lim_{t \to \infty} L(t)$ $(=: L(\infty))$ exists.
Then L is in $M_u$ iff L(0) is P-integrable and E(L(T)) = E(L(0)) for all stopping
times, T.

Remark: If L is in $M_u$ this follows from Doob's Optional Stopping Theorem. For
the converse take $T = t_A$, the restriction of the constant stopping time t to
some A in F(t). Then E(L(T)) = E(L(0)). Since $E(L(\infty)) = E(L(0))$ ($S \equiv \infty$
is an optional time), decomposing both expectations over A and $A^c$, it follows
that $E(L(t) 1_A) = E(L(\infty) 1_A)$. That is, the last equation holds because

$$EL(0) = EL(T) = \int_A L(t)\, dP + \int_{A^c} L(\infty)\, dP$$

and

$$EL(0) = EL(\infty) = \int_A L(\infty)\, dP + \int_{A^c} L(\infty)\, dP.$$

Since A is an arbitrary event in F(t), we have $L(t) = E(L(\infty) | F(t))$ and so L
belongs to $M_u$.
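The equality of expectations at stopping times can be checked exactly in a toy discrete setting. The sketch below is an illustration only, not the continuous-time Lemma: it assumes a symmetric ±1 random walk S, an optional linear drift, and the bounded stopping time T = (first time |S| reaches a barrier) capped at a fixed horizon. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_stopped_value(n_steps, drift_per_step, barrier):
    """Exact E[L(T)] for L_k = S_k + drift_per_step * k, where S is a symmetric
    +/-1 random walk started at 0 and T is the first time |S| reaches `barrier`,
    capped at n_steps (so T is a bounded stopping time)."""
    total = Fraction(0)
    weight = Fraction(1, 2 ** n_steps)  # all 2**n_steps sign sequences are equally likely
    for steps in product((-1, 1), repeat=n_steps):
        s, k = 0, 0
        for step in steps:
            if abs(s) >= barrier:  # already stopped: ignore the remaining steps
                break
            s += step
            k += 1
        total += weight * (s + drift_per_step * k)  # value of L at time T = k
    return total
```

With `drift_per_step = 0` the process is a martingale and the function returns exactly 0 = E(L(0)); with a positive drift the result is `drift_per_step * E(T)`, which is strictly positive, so the equality of expectations fails, as the Lemma predicts for a non-martingale.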

Remark: It follows easily then that this martingale definition of orthogonality is
stronger than the natural orthogonality in the Hilbert space $L^2 \leftrightarrow K^2$ under
the inner product condition $E m(\infty) n(\infty) = 0$:

6.4.8. Theorem:

If the square integrable martingales m and n are strongly orthogonal, then
$E m(\infty) n(\infty) = 0$ and the product mn is a martingale in $K_0^1$.

Remark: We could equally well claim that for all stopping times T, $m_T$ and $n_T$
are orthogonal in $L^2$. This follows since $m^*_\infty$ and $n^*_\infty$ in $L^2$ implies that the
product $m^*_\infty n^*_\infty \in L^1$. But then $(mn)^*_\infty \le m^*_\infty n^*_\infty$, so that $mn \in K_0^1$. It follows that
$E m_T n_T = E m_0 n_0 = 0$, by the Lemma.

A converse also holds: If $m_0 n_0 = 0$ and $m_T$, $n_T$ are orthogonal in $L^2$, for all
stopping times T, then m and n are orthogonal in the sense of the definition of strong
orthogonality. This is also a consequence of the Lemma.

6.4.9. A continuous local martingale (CLM) is a local martingale whose paths
are continuous (for P-almost all paths). Let $M^c_{loc}$ be the family of continuous
local martingales. On occasion, we will also (following Jacod) use the notations
$K^{p,c}$, $M^c_{0,loc}$ and so on, the same meaning being carried by the superscript c;
namely, to denote various subfamilies of $M^c_{loc}$ satisfying the additional
requirement of path continuity.

6.4.10. If the local martingale m is (strongly) orthogonal to each $n \in M^c_{0,loc}$, then m
is said to be a (purely) discontinuous, or a compensated jump martingale.

The first name is widely used, but is misleading since, for example, the
compensated Poisson process $(N(t) - \lambda t, t \ge 0)$ is such a martingale and its paths are
continuous between jumps.
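The shape of such a path is easy to see numerically. The following is a minimal sketch, assuming a homogeneous Poisson process with constant rate (so the compensator is rate * t); the function name, the seed argument and the closure `m` are hypothetical and introduced only for illustration.

```python
import random


def compensated_poisson_path(rate, horizon, seed=0):
    """Simulate the jump times of a rate-`rate` Poisson process N on [0, horizon]
    and return them together with m(t) = N(t) - rate * t."""
    rng = random.Random(seed)
    jump_times = []
    t = rng.expovariate(rate)  # inter-arrival times are exponential with mean 1/rate
    while t <= horizon:
        jump_times.append(t)
        t += rng.expovariate(rate)

    def m(time):
        count = sum(1 for u in jump_times if u <= time)  # N(time), right continuous
        return count - rate * time

    return jump_times, m
```

Between consecutive jump times `m` decreases linearly with slope `-rate`, and at each jump time it increases by exactly 1: the path is continuous between jumps although the martingale is "purely discontinuous" in the sense of 6.4.10.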

Let $M^d_{loc}$ be the subset of $M_{loc}$ consisting of compensated jump martingales. $M^d_{loc}$
is called the space of compensated jump martingales.

Lemma:

Let $p \in [1,\infty]$. Then $K_0^p$, $K^{p,c}$ and $K^{p,d}$ are closed subspaces of $K^p$.

Remark: The proof uses the equivalence of $\|m^*(\infty)\|_p$ and $\|m(\infty)\|_p = \|m\|_{K^p}$.
Since then if $\|m^{(n)} - m\|_{K^p}$ converges to 0, as $n \to \infty$, there exists a subsequence $(n_k)$
such that $\sup_{t \in R_+} |m^{(n_k)}_t - m_t| \to 0$ a.s.P. That is, $m^{(n_k)}_t(w) \to m_t(w)$, uniformly
on $[0,\infty]$, for all w in some set of P-measure 1. Therefore, sequences of continuous
processes converge to continuous processes, so $K^{p,c}$ is closed.

If $m^k \in K^{p,d}$, for each k and n is any bounded continuous martingale with n(0) = 0,
then $E n_T m^k_T = 0$ for each k. Again, if $m^k$ converges to m, then $E n_T m_T = 0$
and so m is in $K^{p,d}$.


We will return to discuss the structure of $K^{2,d}$ at the end of Section 6.5.

6.5. Semi-Martingales Revisited: We now return to the definition of a
semi-martingale, introduce some convenient notation and state some results that are
crucial  to  the  development  of  stochastic  integrals.  This  Section  and  the 
remainder  of  the  chapter  follow  Jacod  [1979]  very  closely.  In  order  to  remind  the 
reader  that  the  notation  should  not  obscure  the  simplicity  of  the  semi-martingale 
concept,  we  state  its  definition  as  follows: 

6.5.1. Definition: A Skorokhod process, X, is a semi-martingale relative to a
filtered probability space, $(\Omega, H, F, P)$, if there exists a sequence of F-stopping
times, $T_n \uparrow \infty$, such that for each n, there exists an F-martingale $M^{(n)}$
with $M^{(n)}(0) = 0$ and an F-adapted process of bounded variation, $A^{(n)}$, such
that $X(t,w) = M^{(n)}(t,w) + A^{(n)}(t,w)$ for all $(t,w) \in [[0, T_n))$.

6.5.2. Remark: Of course, this is equivalent to the requirement that there exist
processes $m \in M_{0,loc}$ and $A \in (BV)_{loc}$ such that X = m + A. Notice that the
condition m(0) = 0 is no restriction, since if $m(0) \ne 0$ then we can write
X = (m - m(0)) + (A + m(0)), obtaining a representation of X that satisfies the
requirement. Notice also that the requirement that X be Skorokhod is redundant since
both $M^{(n)}$ and $A^{(n)}$ in the definition are Skorokhod.

6.5.3. Remark: Let S = S(F) = S(F,P) denote the collection of equivalence
classes of semi-martingales on $(\Omega, F(\infty), F, P)$.

If $X \in S$, the representation X = m + A is in general not unique. For a simple,
but artificial, example let A(t) = t and m be any local martingale of bounded
variation. Then another representation of this X is X = 0 + A', where A' = m
+ A.

A semi-martingale for which the decomposition X = m + A is unique is called a
special semi-martingale. The subfamily of S consisting of all special
semi-martingales will be denoted by $S_p$.


M. Yor is credited by Dellacherie and Meyer with the following example of a
semi-martingale which is not special. Start with the probability space
([0,1], H, L), where L denotes complete Lebesgue measure, and let A be a positive
random variable. Define the filtration (F(t)) on H by setting $F(t) = \{\emptyset, [0,1]\}$, for
$0 \le t < 1$, and F(t) = H, for $t \ge 1$. Set $X(t) := A\,1_{[1,\infty)}(t)$. Then X is an
increasing process and so is a semi-martingale. This process X is a special
semi-martingale iff $A \in L^1$, according to the next Theorem.
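A brief sketch of why integrability of A is exactly the right condition, using the criterion on the supremum process $X^*(t) = \sup_{s \le t} |X(s)|$ from the Theorem characterizing special semi-martingales. The stopping-time argument is filled in here as an aside and is not taken from the source.

```latex
% Sketch. X(t) = A\,1_{[1,\infty)}(t), with F(t) trivial for t < 1.
% For any stopping time T and any t < 1, the event \{T \le t\} lies in F(t),
% so it has probability 0 or 1; hence P(T < 1) \in \{0, 1\}.
% If T_n \uparrow \infty a.s., then P(T_n < 1) \to 0, so P(T_n \ge 1) = 1 for all large n, and
X^*(T_n) = \sup_{s \le T_n} |X(s)| = A \quad \text{a.s.}, \qquad
E\,X^*(T_n) = E(A).
% Therefore X^* is locally integrable iff E(A) < \infty, i.e. iff A \in L^1.
```

In other words, the trivial filtration before time 1 leaves no room to localize away a non-integrable jump at time 1.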

The Doob-Meyer Decomposition shows that submartingales are contained in $S_p$.
The following theorem sheds further light on the structure of $S_p$.

6.5.4. Theorem: (Characterization of Special Semi-martingales)

Let $X \in S$ and

X = m + A. (8)

The following statements are equivalent:

(1) $X \in S_p$; that is, there exists a (unique) decomposition (8) with
A previsible and in $(IV)_{loc}$.

(2) There exists a decomposition (8) with A in $(IV)_{loc}$.

(3) Each decomposition (8) satisfies A in $(IV)_{loc}$.

(4) The increasing process $X^*(t) = \sup_{s \le t} |X(s)|$ belongs
to $(IV^+)_{loc}$.

Remark: The decomposition specified in (1) is called the canonical
decomposition of an element of $S_p$.

The  following  Lemma  is  needed  in  the  proof  of  this  Theorem: 

6.5.5. Lemma:

(1) X is both a local martingale and a process of bounded
variation iff X is a local martingale of local integrable
variation.

(2) If $X \in M_{0,loc}$ and is a previsible process of bounded
variation, then X is evanescent.

Remark: Jacod's sensible way of expressing (1) of the Lemma is to just write

$$M_{loc} \cap BV = M_{loc} \cap (IV)_{loc}.$$

Part (2) has the obvious consequence that the only continuous local martingales
of bounded variation are constant processes. In plain language, non-constant,
continuous local martingales are of unbounded variation (have paths that are of
unbounded variation).

To obtain some exercise with the definitions, we indicate the proof of (1). We
only need to show the inclusion in one direction. Let X be a local martingale of
bounded variation and $(T_n)$ be the localizing sequence for X as a local martingale.
Set

$$S_n = \inf\Big( t : \int_{[0,t]} |dX(s)| \ge n \Big).$$

We will use the notation X(s) and $X_s$ interchangeably. In any case, $X^T$ is still the
process stopped at T. Then

$$\int_{[0, S_n \wedge T_n]} |dX(s)| \le n + |\Delta X^{T_n}_{S_n}|.$$

Since $|\Delta X^{T_n}_{S_n}| \le n + |X^{T_n}_{S_n}|$ and $\min(S_n, T_n) \uparrow \infty$, as $n \to \infty$, we have that X
belongs to $(IV)_{loc}$.

The second statement of the Lemma is an extension of the same result in
Chapter 4, where the process was a previsible, increasing martingale vanishing at
the origin. The result here follows from the first Theorem of the third Section of
this Chapter and its Corollary.

Remark: (Proof of the Theorem 6.5.4 characterizing $S_p$) Following
Jacod [1979, p. 29], assume that statement (2) of the theorem holds and so X = m
+ A, with A of local integrable variation ($A \in (IV)_{loc}$). We show that (1) holds.
Write $X = m + A = m + A - A^p + A^p$. Since $A \in (IV)_{loc}$ we know by
Theorem 6.3.3 that $A - A^p$ is a local martingale. Therefore, $X = m' + A^p$,
where $m' \in M_{loc}$ and $A^p$ is a previsible process of bounded variation. But again by
Theorem 6.3.3, it follows then that $A^p$ is in $(IV)_{loc}$. This takes care of statement
(1), except for uniqueness. But this follows easily from part (2) of the last
Lemma. That is, just assume that $X = m' + A^p$ has a second representation
X = n + B, with B previsible. Then the process $n - m' = A^p - B$ satisfies the
conditions of part (2) of the Lemma and so is evanescent. Therefore the
representation is unique up to indistinguishability, and (1) holds.

The next step is to show that statement (4) follows from (1). Let X = m + A be
the "canonical decomposition" of (1) and $A^*(t) = \sup_{s \le t} |A(s)|$, for all $t \ge 0$.
Then $A^* \in (IV)_{loc}$, since $A \in (IV)_{loc}$. Let $(T_n)$ be the localizing sequence for m and
$S_n := \inf(t : m^*(t) \ge n)$, where the * again indicates the supremum process. Then
we can assume that $S_n \uparrow \infty$, so that $\min(S_n, T_n) \uparrow \infty$. Therefore, $m(S_n \wedge T_n)$ is
P-integrable (just recall that $m^{T_n}$ belongs to $M_u$). Hence, $m^*(S_n \wedge T_n)$ is bounded
above by $n + |m(S_n \wedge T_n)|$, so that $m^*$ is of local integrable variation. Since $A^*$ is
also in this family of processes we have that $X^*$ is of local integrable variation
and (4) holds.

The remaining parts, showing that (4) implies (3) and (3) implies (2), are,
respectively, straightforward and trivial.

6.5.6. The following Corollary shows that any semi-martingale can be
transformed into a special semi-martingale with uniformly bounded jumps:

6.5.7. Corollary:

Let X be any semi-martingale, a > 0 and $X^a$ the process defined by setting

$$X^a(t) = \sum_{s \le t} \Delta X(s)\, 1_{[|\Delta X(s)| > a]}.$$

Then $X^a \in BV$ and $X - X^a$ is a special semi-martingale whose canonical
decomposition, $X - X^a = m + A$, satisfies $|\Delta m| \le 2a$ and $|\Delta A| \le a$.
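The splitting of X into its large jumps and a remainder with bounded jumps can be mimicked on a sampled path. The sketch below is a discrete-time analogy only: it treats the increments of a finite list of samples as if they were the jumps of X, which is an assumption made for illustration, and the function name is hypothetical.

```python
def split_big_jumps(path, a):
    """Split a sampled path into (big, rest): `big` accumulates the increments
    whose absolute size exceeds a (a stand-in for X^a), and rest = path - big
    has every increment bounded by a in absolute value."""
    big = [0.0]
    for prev, cur in zip(path, path[1:]):
        jump = cur - prev
        # keep the increment in `big` only when it is larger than the threshold a
        big.append(big[-1] + (jump if abs(jump) > a else 0.0))
    rest = [x - b for x, b in zip(path, big)]
    return big, rest
```

For instance, with `path = [0.0, 0.5, 3.0, 2.5, -1.0, -0.5]` and `a = 1.0`, the two increments 2.5 and -3.5 are moved into `big`, and every increment of `rest` has absolute value at most 1.0. Only finitely many increments are removed, which is the discrete counterpart of $X^a$ being of bounded variation.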

Remark: Since this result will allow a second Corollary that is central to the
construction of the stochastic integral, we will give its proof: By definition of
semi-martingale, X is Skorokhod, so that the paths $t \to X^a(t,w)$ have only a finite
number of jumps in any finite interval (Section A 1.1.2). Consequently, $X^a$ is of
bounded variation. Since adding a process of bounded variation to a
semi-martingale returns a semi-martingale, $Y := X - X^a \in S$. By construction
$|\Delta Y| \le a$. We use this fact to show that the supremum process corresponding
to Y is an increasing process of local integrable variation, which by the Theorem
demonstrates that Y is a special semi-martingale. Set $T_n = \inf(t : Y^*(t) \ge n)$.
Then we can choose $T_n \uparrow \infty$, for if not then $0 \le Y^*(t) \le n_0$ for some $n_0$ and all $t \ge 0$,
and then the process is of increasing, integrable variation, so certainly of local
integrable variation. On the other hand when $T_n \uparrow \infty$, then $0 \le Y^*(T_n) \le n + a$, so
that $Y^*$ is an increasing process which is locally of integrable variation. Y is
therefore a special semi-martingale (6.5.4).

Let Y = m + A be the canonical decomposition of Y. We have
$\Delta Y = \Delta m + \Delta A$. The idea of the proof is as follows: We know from Chapter
4, on previsible projections, that ${}^p(\Delta Y) = {}^p(\Delta m) + {}^p(\Delta A)$. Since A is previsible,
${}^p(\Delta A) = \Delta A$. It will be shown below that ${}^p(\Delta m) = 0$. Consequently,
${}^p(\Delta Y) = \Delta A$. Boundedness of the jumps of A follows from the Chapter 4 result
that previsible projections preserve order: $|\Delta Y| \le a$ implies ${}^p|\Delta Y| \le {}^p a = a$.
This gives the result that $|\Delta A| \le a$. Immediately then
$|\Delta m| \le |\Delta Y| + |\Delta A|$, so that $|\Delta m| \le 2a$, and the proof is complete
except for justifying ${}^p(\Delta m) = 0$.

To see that the previsible projection of the jump process of m (i.e., $(\Delta m_t, t \ge 0)$)
is evanescent, let T be a previsible time with announcing sequence $(T_n)$. Then
$T_n \uparrow T$, and $T_n < T$ on [T > 0]. Doob's Optional Stopping Theorem supplies
the fact that $E((m(T) - m(T_n)) | F(T_n)) = 0$ so that (heuristically) letting
$n \to \infty$, we obtain $E((m(T) - m(T-)) | F(T-)) = 0$ on $[0 < T < \infty]$, which
says that the jump process has an evanescent previsible projection.

Now  the  much  anticipated  and  important  result. 

6.5.8. Corollary:

If M is a local martingale, then it has a decomposition, M = m' + m'',
where m' is a local martingale vanishing at the origin, with uniformly bounded
jumps, $|\Delta m'| \le 1$, and m'' is a local martingale of local integrable variation.

6.5.9. Remark: This decomposition is not unique. Since local martingales are, of
course, semi-martingales, we can apply the last corollary to $M \in M_{loc}$, with a = 1/2.
The result is $M = M^a + m + A$, where m and A are as in the definition of a
special semi-martingale, $|\Delta m| \le 2a = 1$ and A is previsible and locally of
integrable variation. Set m' = m and $m'' = M^a + A = M - m'$. Since $M^a$ and A
are of bounded variation, M - m' = m'' is of bounded variation. Since m'' is
also a local martingale, Lemma 6.5.5 guarantees us that m'' is of local
integrable variation.

Remark: We now discuss the structure of the class $K^{2,d}$ and obtain, as a
consequence, information about the sums of jumps of any local martingale. Such
results are needed in order to define the quadratic variation of local martingales
and thence semi-martingales in the next Section.

Finally, we reformulate the previous decomposition theorem for local martingales
into one whose summands are continuous and purely discontinuous local
martingales. For us this will complete the geometrical picture of local martingales as
sums of orthogonal processes. The main purpose for including it here, however,
is that it can be used to obtain a corollary which gives us the important fact that
the continuous part of any semi-martingale is unique.


6.5.10. Remarks: A rigorous discussion on the structure of the space of purely
discontinuous square integrable martingales would require more space than is
appropriate in this note. But certain facts can be explained. We start with
martingales which are of bounded variation. Let m be such a martingale. Then we
can show that

$$m_t = m_0 + \sum_{0 < s \le t} \Delta m_s - \Big( \sum_{0 < s \le t} \Delta m_s \Big)^p,$$

where the symbol p indicates that the last term on the right is the dual previsible
compensator of the sum of the jump process $t \to \Delta m_t$ over the interval (0,t].
Thus, m is represented as a sum of compensated jump martingales.

The proof of this statement is quite easy and it also shows that the compensator
(the dual previsible projection) is continuous: Just set $X = m - m_0 - J$, where
$J := \sum_{0 < s \le t} \Delta m_s$. Then it is clear that X is continuous and so previsible. So
$X^p = X$. Also, since $m - m_0$ is an $IV_0$ martingale, its dual previsible projection
is evanescent (Theorem 1.6.14). Therefore, using the linearity of the dual
previsible projection operator, we have that $X = X^p = -J^p$, which says that $J^p$ is
continuous and $m = m_0 + J - J^p$, which is the stated result.

It turns out that such martingales are dense in $K^{2,d}$. The usual way to establish
this fact (Meyer [1976]) is to let T be a stopping time and define the subspace
M[T] of $K^{2,d}$ to be those martingales which are continuous outside of the graph of
T. In order to state the basic results, first consider the case where T = 0, a.s.P
(remember that we are still under the "usual conditions", so T is
indistinguishable from the zero stopping time). If $m \in M[0]$, then $m - m_0$ is a square
integrable, purely discontinuous martingale which is also continuous. Therefore,
$m - m_0$ must be the zero martingale, and so $m_t$ is the constant martingale equal to
the random variable $m_0$ for all $t \ge 0$. (Remember the convention stated in the
Outline, which set $m_{0-} = 0$, so that all members of the space of continuous
martingales must satisfy the condition $m_0 = 0$.)

Therefore, suppose from now on that T > 0, a.s.P. So if $m \in M[T]$, we must have
$m_0 = m_{0-} = 0$ and so $M[T] \subset K_0^{2,d}$, when T > 0, a.s.P.


Now, let $m := A - A^p$, where $A := g\,1_{[[T,\infty))}$ and g is a random variable
with finite second moment, so that A is in IV and m is a martingale (Chapter 4),
a compensated jump martingale.


There  are  two  cases  to  treat:  (i)  T  totally  inaccessible  and  (ii)  T  previsible. 

Consider case (i): Since $A^p$ is previsible, its discontinuities, if any, are exhausted
by a sequence of previsible times and by definition (of the term "exhaust",
Chapter 2) it cannot charge any other stopping times; in particular, it cannot
charge T, since T is totally inaccessible. Further, we now show that $A^p$ cannot
even charge any previsible time.

To see this, recall the language of Chapter 4, and the fact that the measures
generated by A and $A^p$ agree on G(PT), the σ-algebra of previsible events. Then
notice that here, the support of the measure is [[T]]. So if $A^p$ charges a
previsible time, U, then the random set $[[U]] \cap [[T]]$ is not evanescent and so T is not
totally inaccessible. This contradiction therefore tells us that $A^p$ does not charge
any previsible time. Hence, $A^p$ must be continuous and so m is continuous
outside of the graph of T.

Finally in case (i), with more effort than we want to expend here, it can be shown
that $A^p_\infty \in L^2$, so that this compensated sum of jumps martingale is square
integrable. Therefore, $m \in M[T] \subset K_0^{2,d}$.

Now in case (ii), with T previsible and a.s.P positive, and with the additional
assumption on g that $E(g | F(T-)) = 0$, it can be shown that $(g\,1_{[[T,\infty))})^p = 0$.
Hence the compensated jump martingale, m, has the form $m = g\,1_{[[T,\infty))}$ and
belongs to M[T].

Therefore, with either of the assumptions on T and the corresponding
assumptions on g, $m = g\,1_{[[T,\infty))} - (g\,1_{[[T,\infty))})^p$ (called a compensated jump
martingale) is a martingale in M[T].

Further, it can be shown that for every n, $n \in K^2$, the process

$$L = mn - \Delta m_T \Delta n_T 1_{[[T,\infty))}$$

is a uniformly integrable martingale which vanishes at the origin (belongs to
$M_0$).

When n is continuous at T, this shows that $mn \in M_0$, so that m is orthogonal to
every n which is continuous at T. Since our compensated jump martingale is in
$K^2$, we also have $m^2 - (\Delta m_T)^2 1_{[[T,\infty))} \in M_0$. From this and the properties of
uniformly integrable martingales,

$$E(m_\infty^2) = E((\Delta m_T)^2). \qquad (9)$$

We now apply these observations to an arbitrary n in $K^2$. Set
$m = \Delta n_T 1_{[[T,\infty))} - (\Delta n_T 1_{[[T,\infty))})^p$. Then m is a compensated jump martingale
with the property that n - m is continuous at T, and consequently, orthogonal to
M[T]. m is therefore the projection of n onto M[T].
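A worked instance of the identity $E(m_\infty^2) = E((\Delta m_T)^2)$ for a compensated jump martingale may help fix ideas. The setting is an assumption chosen for illustration and is not taken from the text: T is exponentially distributed with rate λ, g = 1, and F is the (completed) filtration generated by the process $1_{[[T,\infty))}$, in which T is totally inaccessible.

```latex
% Assumptions (illustrative): T \sim \mathrm{Exp}(\lambda), g = 1,
% F generated by the process 1_{[[T,\infty))}.
A_t = 1_{\{T \le t\}}, \qquad A^p_t = \lambda\,(t \wedge T), \qquad
m_t = 1_{\{T \le t\}} - \lambda\,(t \wedge T).
% Terminal value and jump:
m_\infty = 1 - \lambda T, \qquad \Delta m_T = 1.
% Since \lambda T \sim \mathrm{Exp}(1), E(\lambda T) = 1 and E\big((\lambda T)^2\big) = 2, so
E(m_\infty^2) = 1 - 2\,E(\lambda T) + E\big((\lambda T)^2\big) = 1 - 2 + 2 = 1
             = E\big((\Delta m_T)^2\big).
```

Here the compensator $A^p$ is continuous, in agreement with case (i), and m is continuous outside the graph of T.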

We can now state the principal result concerning the structure of purely
discontinuous, square integrable martingales.

6.5.11. Theorem:

If $m \in K_0^{2,d}$, then m is the sum of a series of compensated jump martingales
and m is orthogonal to every martingale $n \in K^2$ which does not charge a jump of
m (so m is orthogonal to every member of $K^{2,c}$).


Remarks: By a Theorem in Chapter 2 there is a sequence of stopping times, $(T_n)$,
that  exhaust  the  jumps  of  m.  (Recall  that  the  definition  of  exhaust  includes  the 
fact  that  the  graphs  of  these  stopping  times  are  pairwise  disjoint.)  Further,  since 
each  stopping  time  can  be  decomposed  into  the  sum  of  a  totally  inaccessible  and 
an  accessible  stopping  time  (whose  graphs  are  disjoint),  and  by  definition  each 
accessible  time  is  included  in  the  union  of  a  sequence  of  previsible  times,  we  can 
assume  that  each  Tn  is  either  totally  inaccessible  or  previsible. 

For each k, let $m^{(k)}$ be the compensated jump martingale associated with the
stopping time $T_k$. Since the graphs, $[[T_k]]$, are pairwise disjoint, the $m^{(k)}$
are pairwise orthogonal (in $L^2(P)$, if you like).

Letting $U^{(k)} := \sum_{i=1}^{k} m^{(i)}$, we have that $m - U^{(k)}$ is continuous at the
stopping times $T_1, \ldots, T_k$ and, therefore, orthogonal to
$m^{(1)}, \ldots, m^{(k)}$. It follows that $m - U^{(k)}$ is orthogonal to $U^{(k)}$.

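Therefore, if we write $m = U^{(k)} + (m - U^{(k)})$, square both sides and then take
expectations of the result we have

$E\,m_\infty^2 = E(U^{(k)}_\infty)^2 + E(m_\infty - U^{(k)}_\infty)^2$

$= \sum_{i=1}^{k} E(\Delta m_{T_i})^2 + E(m_\infty - U^{(k)}_\infty)^2.$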

(We have used equation (9).) It follows that $U^{(k)}$ converges to an element U of
$K^{2,d}$. (Recall a previous Lemma stating that $K^{2,d}$ is closed.) It is a simple
matter to conclude therefore that $m - U \in K^{2,c}$, and m - U is orthogonal to U.
But since m is purely discontinuous, it follows that m - U is orthogonal to
itself. Hence, m = U, and so

$E\,m_\infty^2 = \sum_{i=1}^{\infty} E(\Delta m_{T_i})^2.$

This  completes  the  “proof”  of  the  Theorem. 
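
To see the identity at work, here is a hedged worked example that is not part of the text: take a Poisson process N of rate $\lambda$ stopped at a fixed time $t_0$ (N, $\lambda$ and $t_0$ are assumptions made only for this illustration). Its jump times $T_n$ are totally inaccessible, each jump has size 1, and the compensated process is a purely discontinuous square integrable martingale, so both sides of the identity can be computed directly.

```latex
% Illustrative assumption: N is a Poisson process of rate \lambda, t_0 > 0 fixed.
\begin{align*}
m_t &:= N_{t \wedge t_0} - \lambda\,(t \wedge t_0),
  \qquad (\Delta m_{T_n})^2 = 1_{\{T_n \le t_0\}}, \\
E\,m_\infty^2 &= \operatorname{Var}(N_{t_0}) = \lambda t_0, \\
\sum_{n \ge 1} E(\Delta m_{T_n})^2
  &= E \sum_{n \ge 1} 1_{\{T_n \le t_0\}} = E\,N_{t_0} = \lambda t_0 .
\end{align*}
```

Both sides equal $\lambda t_0$, as the Theorem predicts for a member of $K_0^{2,d}$.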

Now if we take any element m in $K^2$, not necessarily continuous or purely
discontinuous, then we can carry out the same construction and write
m = U + (m - U). Therefore, we have a unique decomposition of m into its
“continuous” and “purely discontinuous” parts. This decomposition also yields,
as before,

$E\,m_\infty^2 = \sum_{n=1}^{\infty} E(\Delta m_{T_n})^2 + E(m_\infty - U_\infty)^2.$

But now m is not necessarily equal to U, so

$E\,m_\infty^2 \ge \sum_{n=1}^{\infty} E(\Delta m_{T_n})^2,$  (10)

with equality holding iff $m \in K^{2,d}$.

Returning to the decomposition of $m \in K^2$, and writing $m^c = m - U$ and
$m^d = U$, we can say that there exist $m^c \in K^{2,c}$, $m^d \in K^{2,d}$ such that m is
uniquely decomposed into the sum $m = m^c + m^d$.

This is essentially a special case of a more general result about local
martingales. To prove the more general statement directly, without the last version,
Jacod first notes that any local martingale of bounded variation is in the family
$M^d_{loc}$ and, using the decomposition given in Corollary 6.5.8 above, reduces the
proof of the decomposition of local martingales into their continuous and purely
discontinuous parts to a proof of this statement for members of $K_0^2$.


We state this result:


6.5.12.  Theorem: 

Let m be a local martingale. Then there exist local martingales $m^c$ and $m^d$ in
$M^c_{loc}$ and $M^d_{loc}$, respectively, such that $m = m^c + m^d$. This decomposition
of m into its continuous and discontinuous parts is unique.

Remark: We have already mentioned the first of the following two results. The
second will be needed in the construction of the stochastic integral for local
martingales.

(1) Any local martingale of bounded variation is in the family
$M^d_{loc}$.

(2) Any two members of $M^d_{loc}$ which have indistinguishable
jump processes are indistinguishable. The latter means simply
that $\Delta m = \Delta n$ implies m = n for $m, n \in M^d_{loc}$.

Remark:  For  the  second  statement,  let  X  =  m  -  n.  Then  the  hypothesis  of  (2) 
says  that  the  jump  process  of  X  is  the  zero  process.  By  unicity  of  the  decomposi¬ 
tion  theorem,  X  is  then  a  continuous  process  which  takes  the  value  zero  at  the 
origin.  Since  X  is  purely  discontinuous,  this  means  that  X  is  the  zero  process,  or 
what  is  the  same,  m  =  n. 

6.5.13.  Corollary 

Let any semi-martingale, X, have the representations

X = m + A and X = n + B

where $m, n \in M_{0,loc}$ and $A, B \in BV$. Then $m^c = n^c$.

Remark: Just note that $m - n = B - A \in M^d_{loc}$, since it is a local martingale of
bounded variation. Therefore, $0 = (m - n)^c = m^c - n^c$.

If the semi-martingale X decomposes as X = m + A, then we write $X^c = m^c$
and call $X^c$ the continuous part of the semi-martingale. By the Corollary, $X^c$ is
independent of the decomposition. If $X \in M_{loc}$, we set $X^c = m^c$, where $m^c$ is
given by the decomposition $m = m^c + m^d$ of the theorem itself, applied to m = X.


Now,  return  to  the  inequality  (10).  This  immediately  yields  the  following. 


6.5.14. Theorem

If $m \in K^2$, then $\sum_{s \le t} \Delta m_s^2 < \infty$, a.s.P, for all t>0.

Remark: Following custom, we have used the following abbreviation:

$\Delta m_s^2 := (\Delta m_s)^2.$

The  extension  of  this  result  to  local  martingales  is  an  immediate  consequence  of 
applying  this  theorem  to  the  decomposition  given  in  Corollary  6.5.8  and  using  an 
obvious  property  (explicitly  stated  below)  of  processes  of  integrable  variation. 

The following Corollary is necessary to prove the existence of the quadratic
variation of a semi-martingale:

6.5.15. Corollary:

If $m \in M_{loc}$, then $\sum_{s \le t} \Delta m_s^2 < \infty$, a.s.P, for all t>0.

6.5.16. Corollary:

If $X \in S$, X = m + A, then $\sum_{s \le t} \Delta X_s^2 < \infty$, a.s.P, for all t>0.

Remark: Since A is in BV,

$\sum_{s \le t} \Delta A_s^2 \le C \sum_{s \le t} |\Delta A_s| < \infty$

a.s.P, for some positive constant C, for all t>0. The result follows from the
previous Corollary by noting that

$\Delta X_s^2 \le 2(\Delta m_s^2 + \Delta A_s^2).$
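
The two inequalities used in this Remark can be spelled out as follows. This is a sketch under the assumption (an interpretation, not stated in the text) that C is taken to be the path-dependent bound $\sup_{s \le t} |\Delta A_s|$, which is finite because A has finite variation on [0,t].

```latex
\begin{align*}
\sum_{s \le t} (\Delta A_s)^2
  &\le \Big( \sup_{s \le t} |\Delta A_s| \Big) \sum_{s \le t} |\Delta A_s|
   \le C \int_0^t |dA_s| < \infty, \\
(\Delta X_s)^2 = (\Delta m_s + \Delta A_s)^2
  &\le 2\,(\Delta m_s)^2 + 2\,(\Delta A_s)^2
  \qquad \text{since } 2ab \le a^2 + b^2 .
\end{align*}
```

Summing the second line over $s \le t$ and applying the previous Corollary to m gives the result.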

We  will  now  give  a  result  of  Jacod  which  says  that  localization  does  not  extend 
semi-martingales. 

6.5.17.  Theorem: 

(1) $S_p$ is not extended by localization: $S_p = (S_p)_{loc}$.

(2) S is not extended by localization: $S = S_{loc}$.

Remark: We will only prove (1). The main purpose is to illustrate what
Dellacherie and Meyer [1981] call “pasting”: a procedure for constructing a single
process from segments of a sequence of processes.


As always, $S_p \subset (S_p)_{loc}$. So let $X \in (S_p)_{loc}$ and $(T_n)$ be a localizing
sequence of X which reduces X to $S_p$, that is, such that $X^{T_n} \in S_p$ for each n.
Let the canonical decomposition be $X^{T_n} = m^{(n)} + A^{(n)}$ for each n. Since the
$T_n$ are nondecreasing, $T_{n+1} \wedge T_n = T_n$, so that $(X^{T_{n+1}})^{T_n} = X^{T_n}$. The
uniqueness of the canonical decomposition allows the summands of the
decomposition to inherit this property:

$(m^{(n+1)})^{T_n} = m^{(n)}$

$(A^{(n+1)})^{T_n} = A^{(n)}.$

The required local martingale, m, and previsible process of bounded variation, A,
are obtained by pasting these path segments together, path by path, over all
paths. Geometrically, it might help the reader to realize that the first of these
equations, for instance, means that on $[0, T_n(w)]$, $m^{(n)}(w) = m^{(n+1)}(w)$. Thus, m
and A with the required properties exist such that $m^{T_n} = m^{(n)}$ and
$A^{T_n} = A^{(n)}$, and X = m + A. Therefore, $X \in S_p$ and so $S_p = (S_p)_{loc}$.
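
The pasting step can be written out explicitly. The following is a sketch in the notation of the proof; the pathwise definition is one natural way to read “pasting”, not a quotation from Jacod or from Dellacherie and Meyer.

```latex
% For each path \omega and time t, choose any n with t \le T_n(\omega);
% such an n exists because T_n \uparrow \infty.
\begin{align*}
m_t(\omega) &:= m^{(n)}_t(\omega), &
A_t(\omega) &:= A^{(n)}_t(\omega), \qquad t \le T_n(\omega). \\
\intertext{The choice of n does not matter: if $t \le T_n(\omega) \le T_{n+1}(\omega)$, then}
m^{(n+1)}_t(\omega) &= \big(m^{(n+1)}\big)^{T_n}_t(\omega) = m^{(n)}_t(\omega), &
A^{(n+1)}_t(\omega) &= A^{(n)}_t(\omega).
\end{align*}
```

By construction $m^{T_n} = m^{(n)}$ and $A^{T_n} = A^{(n)}$, so m inherits the local martingale property, and A the previsibility and bounded variation, from the pieces.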

Remark:  Thus,  we  have  reached  the  end  of  the  line  in  extending  our  processes  by 
localization.  That  this  is  exactly  the  right  place  to  stop  in  order  to  develop  the 
stochastic  integral  will  only  be  apparent  after  we  complete  the  construction  of 
the  stochastic  integral. 

6.5.18. Remark: In Chapter 5 we gave a very general form of the Doob-Meyer
Decomposition Theorem. We will now state this important result in a more
restrictive and more easily proved form (see for example Ikeda and Watanabe).
Then,  using  this  and  the  results  developed  so  far  in  this  Section,  we  prove  the 
Theorem  as  stated  in  Chapter  5.  This  will  to  some  extent  explain  the  central  role 
this  result  plays  in  the  modern  theory  of  semi-martingales. 

6.5.19. Lemma (Doob-Meyer Decomposition for Class D supermartingales):

Let X be a supermartingale of the class D. Then there exists a unique, previsible,
integrable increasing process, A, such that $X + A \in M_0$ (is a uniformly integrable
martingale which vanishes at the origin). Further, A is continuous iff X is
quasi-left continuous.

6.5.20. Remark: Note that the Lemma does not involve localization. With
Theorem 6.5.17 and Lemma 6.5.19, we can now state and prove the DMD
Theorem in a form equivalent to that of Chapter 5:

6.5.21.  Theorem  (Doob-Meyer  Decomposition): 

Every  supermartingale  (submartingale )  is  a  special  semi-martingale. 

Remark: Because the proof (Jacod) is very elegant and gives application of some
basic martingale results, we will give its outline. Set $T_n := \inf\{t : |X(t)| > n\}$
and $S_n := \min(n, T_n)$. Let F = (F(t)) be the underlying filtration. For each n,
consider the stopped process, $X^n$, and notice that since X is an F-supermartingale,
when $t \ge n$, $X^n_t = X_n = E(X_n \mid F_t)$, and when n > t, then
$X^n_t = X_t \ge E(X_n \mid F_t)$. These two statements can be combined by writing
$X^n_t \ge Y^{(n)}_t := E(X_n \mid F_t)$; thus, for each n, the uniformly integrable
martingale, $Y^{(n)}$, is a minorant for the stopped process $X^n$. Therefore, Doob's
Optional Sampling Theorem applies to the stopped process, $X^n$. That is, the
process $X^{S_n} = (X^n)^{T_n}$ is an F-supermartingale. Further, since this process is
majorized, for each n, by the random variable $n + (X_{S_n})^+$, it is a class D
supermartingale. The previously stated class D form of the Doob-Meyer
decomposition theorem then applies and $X^{S_n}$ is a special semi-martingale. Since
we know that the class of special semi-martingales is closed under localization,
we have that X is a special semi-martingale.

The  following  also  holds: 

6.5.22.  Theorem: 

Every special semi-martingale is the difference of two local supermartingales
(submartingales).

Doob’s  class  D  Lemma  also  applies  to  submartingales.  The  only  change  would  be 
that  we  would  have  X  =  m  +  A  with  A  increasing. 

6.6. The Quadratic Variation Processes of a Semi-Martingale: In various
forms, we have mentioned that if m is a square integrable martingale, then $m^2$ is
in the class D and, hence by Doob's Theorem, a previsible, increasing process of
integrable variation exists which compensates $m^2$ into a martingale. In Chapter 1
we denoted this increasing process by <m,m>.

It may have escaped notice, but we have also proved this for the family of
martingales, m, that are only locally square integrable. If $m \in K^2_{loc}$, then $m^2$ is
a special semi-martingale. This is because $m \in K^2_{loc}$ implies
$m^2 \in (S_p)_{loc} = S_p$, by Theorem 6.5.17. Letting the associated previsible process,
A, of the canonical decomposition be denoted by <m,m> and observing that it is an
increasing process since $m^2$ is a local submartingale, we have


6.6.1.  Lemma: 

If m is a locally square integrable martingale, then there exists a locally
integrable, increasing, previsible process, <m,m>, such that
$m^2 - \langle m,m \rangle \in M_{0,loc}$.

As in Chapter 1, we call the process $\langle m,m \rangle = (\langle m,m \rangle_t,\ t \in R_+)$ the
previsible quadratic variation of m. In our notation, $\langle m,m \rangle \in (IV^+)_{loc}$.

When m and n belong to $K^2_{loc}$, the process <m,n> is defined by polarization, as
in Chapter 1. It is immediate that the mappings $m \to \langle m,n \rangle$ and
$n \to \langle m,n \rangle$ are linear. Indeed, for m,n,l,k in $K^2_{loc}$ and a,b,c,d real numbers,

$\langle am+bn,\ cl+dk \rangle = ac\langle m,l \rangle + ad\langle m,k \rangle + bc\langle n,l \rangle + bd\langle n,k \rangle.$

6.6.2.  Theorem: 

If $m, n \in K^2_{loc}$, then <m,n> is the unique previsible process in $(IV)_{loc}$ such that
$mn - \langle m,n \rangle$ belongs to $M_{0,loc}$.

6.6.3. Remark: It might be worthwhile to first consider the case where m and n
are square integrable martingales ($m, n \in K^2$). The Theorem then follows by noting
that $(m+n)^2 - \langle m+n,m+n \rangle$ is a martingale and equals the sum of the following
three processes, the first two of which are martingales:

$m^2 - \langle m,m \rangle, \quad n^2 - \langle n,n \rangle, \quad 2\{mn - (\langle m+n,m+n \rangle - \langle m,m \rangle - \langle n,n \rangle)/2\},$

so that the last term in the braces must be a martingale. The conclusion that
mn - <m,n> is a martingale follows from the uniqueness of the Doob-Meyer
decomposition.

The  reader  should  note  that  the  conclusion  of  the  remark  says  that  if  you  start 
with  martingales  you  end  up  with  a  martingale,  not  just  a  local  martingale. 

A proof (Jacod [1979]) that gives the generality of the Theorem follows by
recognizing the product mn as a special semi-martingale. This is because writing

$mn = \tfrac{1}{4}((m+n)^2 - (m-n)^2)$

expresses the product mn as the difference of two submartingales. Hence, mn is
in $S_p$, by Theorem 6.5.15.

As in Chapter 1, the process, <m,n>, is called the covariance process of m
and n.

6.6.4. Remark: Again take m and n to be square integrable martingales. An
easy computation, based on the last Remark, yields

$E\{m(t)n(t) \mid F(s)\} - m(s)n(s) = E\{\langle m,n \rangle(t) - \langle m,n \rangle(s) \mid F(s)\}.$

This equation states that the product mn is a martingale iff <m,n>(t) = 0 for
all t>0. Recall the earlier discussion on orthogonality of martingales and store
for later purposes the fact that <m,n> = 0 if m is continuous and n is a
compensated jump martingale.
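
The “easy computation” can be sketched as follows, assuming only that $mn - \langle m,n \rangle$ is a martingale (Theorem 6.6.2 in the square integrable case) and that $s \le t$.

```latex
\begin{align*}
E\{ m(t)n(t) - \langle m,n \rangle(t) \mid F(s) \}
  &= m(s)n(s) - \langle m,n \rangle(s), \\
\text{hence}\quad
E\{ m(t)n(t) \mid F(s) \} - m(s)n(s)
  &= E\{ \langle m,n \rangle(t) - \langle m,n \rangle(s) \mid F(s) \}, \\
\text{and also}\quad
E\{ (m(t)-m(s))(n(t)-n(s)) \mid F(s) \}
  &= E\{ m(t)n(t) \mid F(s) \} - m(s)n(s).
\end{align*}
```

The last line, which uses only the martingale property of m and of n, shows that <m,n> measures the conditional covariance of the increments.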

6.6.5. We now define the (optional) quadratic variation and the (optional)
cross quadratic variation of semi-martingales, X and Y.

$[X,X]_t := \langle X^c, X^c \rangle_t + \sum_{0<s\le t} (\Delta X(s))^2$  (11)

$[X,Y]_t := \langle X^c, Y^c \rangle_t + \sum_{0<s\le t} \Delta X(s)\,\Delta Y(s)$  (11.1)

for  all  t  in  R+. 
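
As a hedged illustration of definition (11), not taken from the text, let W be a standard Brownian motion and N a Poisson process of rate $\lambda$, both adapted to the same filtration, and set $X = \sigma W + (N - \lambda t)$; the symbols W, N, $\sigma$ and $\lambda$ are assumptions for this example only. Here X is itself a local martingale, its continuous part is $\sigma W$, and its jumps are those of N.

```latex
\begin{align*}
X_t &= \sigma W_t + (N_t - \lambda t), \qquad X^c = \sigma W,
  \qquad \Delta X(s) = \Delta N(s) \in \{0,1\}, \\
\langle X^c, X^c \rangle_t &= \sigma^2 t, \qquad
\sum_{0 < s \le t} (\Delta X(s))^2 = N_t, \\
[X,X]_t &= \sigma^2 t + N_t .
\end{align*}
```

The first term is continuous and deterministic, while the second is a pure jump process, so [X,X] is continuous only when the jump part is absent.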

6.6.6. Remark: From the definition of < , > on $K^2_{loc}$, <m,n> is well-defined
for m and n in $M^c_{loc}$, since $M^c_{loc} \subset K^2_{loc}$. Section 5.1.11 gives a proof of the
fact that any continuous local martingale is an $L^p$ local martingale, for any
$p \ge 1$. Hence, the first terms on the right side of equations (11) and (11.1) are
well-defined. Corollary 6.5.16 of the last Section then shows that [X,X] is
well-defined. It follows easily that [X,Y] makes sense.

Having observed that $M^c_{loc} \subset K^2_{loc}$, the following example due to C. Stricker
(Dellacherie and Meyer [1980]) of a local martingale that is not locally square
integrable is probably worth the interruption. Define the filtration (F(t)) on the
probability space, $(\Omega, H, P)$, as $F(t) = \{\emptyset, \Omega\}$, for $0 \le t < 1$, and F(t) = H,
for $t \ge 1$. Then $X(t) = E(h \mid F(t))$, where $h \in L^1 - L^2$, is such an example.
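
Why this example works can be sketched as follows; the argument is a reconstruction under the stated filtration, not a quotation from the source. Since F(t) is trivial for t<1, a stopping time cannot distinguish paths before time 1, so any localizing sequence must eventually reach time 1, where X already equals h.

```latex
\begin{align*}
&\text{For a stopping time } T \text{ and } t < 1:\quad \{T \le t\} \in F(t)
  \ \Rightarrow\ P(T \le t) \in \{0,1\}, \\
&\text{so either } T = c \text{ a.s. for a constant } c < 1, \text{ or } T \ge 1 \text{ a.s.} \\
&\text{If } T_n \uparrow \infty, \text{ then } T_n \ge 1 \text{ a.s. for all large } n, \text{ and }
  X^{T_n}_1 = X_1 = h, \\
&\text{whence}\quad \sup_t E\big(X^{T_n}_t\big)^2 \ \ge\ E\,h^2 = \infty
  \qquad (h \in L^1 \setminus L^2).
\end{align*}
```

Thus for large n no stopped process $X^{T_n}$ is square integrable, although X is a uniformly integrable martingale.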

6.6.7. It can be shown that $\langle X^c,X^c \rangle$ is always continuous. (Actually, the
previsible compensator of the submartingale in the DMD Theorem is continuous iff
the process is quasi-left continuous, which is true if the process is continuous.)
Therefore, [X,X] will be a continuous process iff the sum vanishes, that is, iff X is
continuous.


Of course, in the cross quadratic variation, the sum is the zero process if X and Y
have no common jumps and clearly [X,Y] = 0 if one of the factors is continuous
and the other is purely discontinuous, since then $\langle X^c,Y^c \rangle = 0$.

We  will  not  take  time  to  prove  the  following  important  Theorem. 

6.6.8. Theorem: Let m and n be local martingales. Then

(1) [m,n] is a process of bounded variation and

(2) $mn - [m,n] \in M_{0,loc}$.

6.6.9. Remark: We see from this Theorem that m and n are (strongly) orthogonal
iff [m,n] is a member of $M_{0,loc}$, for then $mn \in M_{0,loc}$.

Our main purpose, however, in stating this Theorem is that it is but a short step
to the result that [X,X] is a member of $V^+$. For let X = m + A, with m in
$M_{0,loc}$ and A in BV. Then, as shown in the proof of Corollary 6.5.16, the series
with terms $(\Delta A)^2$ converges, so that the process defined by
$\sum_{0<s\le t} (\Delta A(s))^2$ belongs to $V^+$. By (1) of the previous Theorem, [m,m] is
in $V^+$. Hence, $[X,X] \in V^+$, and we have the following Theorem.

6.6.10. Theorem:

If X is a semi-martingale, then [X,X] is an increasing process.

6.6.11. Remark: We will list a few of the consequences stemming from this
result. Let X and Y be semi-martingales. Then

(1) $[X,Y] \in BV$;

(2) $(X,Y) \to [X,Y]$ is bilinear;

(3) If T is a stopping time, then

$[X,Y]^T = [X^T,Y^T] = [X^T,Y] = [X,Y^T].$

6.6.12. Remark: If $m, n \in K^2_{loc}$, then we have seen that mn - <m,n> and
mn - [m,n] both belong to $M_{0,loc}$. Therefore, $[m,n] - \langle m,n \rangle \in M_{0,loc}$ also. Since
$[m,n] \in (IV)_{loc}$, we have $\langle m,n \rangle = [m,n]^p$. That is, <m,n> is the dual previsible
projection of [m,n]. For $X, Y \in S$, we therefore can extend the definition of < , >
by setting $\langle X,Y \rangle := [X,Y]^p$, whenever $[X,Y] \in (IV)_{loc}$.
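
A small example, offered as an illustration and not taken from the text, shows how <X,X> and [X,X] differ. Take $X = \sigma W + (N - \lambda t)$ with W a standard Brownian motion and N a Poisson process of rate $\lambda$, both assumed adapted to the same filtration; these symbols are hypothetical names for this sketch.

```latex
\begin{align*}
[X,X]_t &= \sigma^2 t + N_t
  && \text{(continuous part plus sum of squared jumps)}, \\
\langle X,X \rangle_t &= [X,X]^p_t = \sigma^2 t + \lambda t
  && \text{(the compensator of } N_t \text{ is } \lambda t), \\
[X,X]_t - \langle X,X \rangle_t &= N_t - \lambda t \in M_{0,loc} .
\end{align*}
```

The optional process [X,X] jumps with X, whereas the previsible process <X,X> is here continuous and deterministic.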


6.6.13. As might be expected, to complete the construction of the stochastic
integral we require inequalities analogous to the Cauchy-Schwarz inequality. The
form of the factors on the right side of the second of these inequalities should be
noted in order to understand the selection of a norm for $L^p(m)$, defined below.
The following is due to Kunita and Watanabe [1967]:

6.6.14. Theorem (Kunita-Watanabe Inequality):

If H and K are optional processes and $m, n \in K^2$, then

$\int_0^\infty |H(s)|\,|K(s)|\,|d[m,n]_s| \le \Big(\int_0^\infty H^2(s)\,d[m,m]_s\Big)^{1/2} \Big(\int_0^\infty K^2(s)\,d[n,n]_s\Big)^{1/2}.$

If p>1 and q is the conjugate of p, then

$E\Big(\int_0^\infty |H(s)|\,|K(s)|\,|d[m,n]_s|\Big) \le \|(H^2.[m,m])_\infty^{1/2}\|_p \; \|(K^2.[n,n])_\infty^{1/2}\|_q.$
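
The second inequality follows from the first by Hölder's inequality. A sketch, with U and V introduced here only as shorthand:

```latex
\begin{align*}
U &:= \Big( \int_0^\infty H^2(s)\, d[m,m]_s \Big)^{1/2}, \qquad
V := \Big( \int_0^\infty K^2(s)\, d[n,n]_s \Big)^{1/2}, \\
E\Big( \int_0^\infty |H(s)|\,|K(s)|\,|d[m,n]_s| \Big)
  &\le E(UV) \le \|U\|_p\,\|V\|_q,
  \qquad \tfrac{1}{p} + \tfrac{1}{q} = 1 .
\end{align*}
```

With p = q = 2 this is the form used when bounding linear functionals on $K^2$.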


Remark. If n and m are continuous, then we can replace [ , ] with < , >. In
fact, the inequality was originally proved in terms of < , >.

The following remarkable Theorem shows that with $p \ge 1$ the norm $\|m^*_\infty\|_p$
and the norm $\|[m,m]^{1/2}_\infty\|_p$ are equivalent. In particular, this means that we
can define the space $K^2$ of square integrable martingales in terms of the $L^2$ norm
of $\sqrt{[m,m]_\infty}$.

6.6.15. Theorem (Davis, Burkholder, Gundy):

Let $p \in [1,\infty)$. Then there exist positive constants $c_p$ and $C_p$ such that for each
$m \in M_{loc}$

$c_p\,\|m^*_\infty\|_p \le \|[m,m]^{1/2}_\infty\|_p \le C_p\,\|m^*_\infty\|_p,$

where $m^*_\infty := \sup_t |m_t|$.
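
For p = 2 the two-sided bound can be checked by hand. The following sketch assumes $m \in K^2$ with $m_0 = 0$ and uses two standard facts not proved here: $E\,[m,m]_\infty = E\,m_\infty^2$ and Doob's $L^2$ maximal inequality. The notation $m^*_\infty := \sup_t |m_t|$ is assumed.

```latex
\begin{align*}
\big\| [m,m]_\infty^{1/2} \big\|_2^2
  &= E\,[m,m]_\infty = E\,m_\infty^2 \le E\,(m^*_\infty)^2 = \| m^*_\infty \|_2^2, \\
\| m^*_\infty \|_2^2
  &\le 4\,E\,m_\infty^2 = 4\,\big\| [m,m]_\infty^{1/2} \big\|_2^2 , \\
\text{so}\quad \tfrac{1}{2}\,\| m^*_\infty \|_2
  &\le \big\| [m,m]_\infty^{1/2} \big\|_2 \le \| m^*_\infty \|_2 .
\end{align*}
```

So for p = 2 one may take the constants 1/2 and 1; the content of the Theorem is that comparable constants exist for every p in [1,∞).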


6.7. Stochastic Integrals Relative to Continuous Local Martingales: Let
m be a local martingale and $p \in [1,\infty)$. As usual $\|.\|_p$ denotes the $L^p$ norm:
$\|f\|_p^p := E(|f|^p)$. Set

$\|H\|_{p,m} := \|(H^2.[m,m](\infty))^{1/2}\|_p$

and $L^p(m) := \{H \text{ previsible} : \|H\|_{p,m} < \infty\}$. The Kunita and
Watanabe Inequality shows that $L^p(m) \subset L^q(m)$ if q < p.

6.7.1. After our discussion on localization of integrators in the introduction to
this Chapter, we noted that localization for integrands would be carried out
differently than for integrators. Let $p \in [1,\infty)$, and let $L^p_{loc}(m)$ denote the set of
all previsible processes H for which there exists an increasing sequence,
$T_n \uparrow \infty$, of optional times such that $H\,1_{[[0,T_n]]} \in L^p(m)$. Notice that this type
of localization is a natural choice for integrands. Attempting to integrate
constants other than zero over unbounded sets relative to σ-finite measures tends
to produce undesirable results.

6.7.2. Suppose that $m \in M^c_{0,loc}$. Then the Lebesgue-Stieltjes stochastic integral
$H^2.[m,m]$ is continuous, and the increasing process $t \to \sqrt{H^2.[m,m](t)}$ is
continuous and vanishes at the origin. Therefore, this process is of bounded
variation iff it is locally bounded. Therefore, under the assumption that
$m \in M^c_{0,loc}$, it can be shown that $L^p_{loc}(m) = L^1_{loc}(m)$ for all $p \ge 1$.

To define the stochastic integral for H in $L^1_{loc}(m)$ relative to $m \in M^c_{0,loc}$ it is
therefore sufficient to define it for $L^2(m)$.

6.7.3. Let $H \in L^2(m)$. Consider the linear transformation on $K^2$ defined by

$n \to C(n) := E((H.[m,n])(\infty)).$  (12)


Set


$|||n||| = \|n(\infty)\|_2,$

the norm on the Hilbert space, $K^2$, equipped with the inner product
$(m,n) \to E\,m(\infty)n(\infty)$. Then according to the Kunita-Watanabe Inequality with K
= 1 and using equivalence of norms, we have that |C(n)| is bounded above by
$\|H\|_{2,m}\,|||n|||$. This shows that C is continuous on $K^2$.
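
The bound on C can be traced step by step. This sketch takes p = q = 2 and K = 1 in the Kunita-Watanabe Inequality and assumes $n_0 = 0$, so that $E\,[n,n]_\infty = E\,n_\infty^2$; for general n the equivalence of norms mentioned above supplies a constant.

```latex
\begin{align*}
|C(n)| &= \Big| E \int_0^\infty H(s)\, d[m,n]_s \Big|
  \le E \int_0^\infty |H(s)|\,|d[m,n]_s| \\
  &\le \big\| (H^2.[m,m])_\infty^{1/2} \big\|_2 \; \big\| [n,n]_\infty^{1/2} \big\|_2
  = \| H \|_{2,m}\, \big( E\,[n,n]_\infty \big)^{1/2}
  = \| H \|_{2,m}\, |||n||| .
\end{align*}
```

A bounded linear functional is continuous, which is the property needed for the representation step that follows.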

But C is a continuous, linear functional on a Hilbert space, so (by the Riesz
representation theorem) there exists a unique process $Y \in K^2$ with the property
that $E(Y(\infty)n(\infty)) = C(n)$ for all $n \in K^2$. Recalling equation (12), this remark
justifies the following

6.7.4. Definition: If $H \in L^2(m)$, we call the stochastic integral of H relative to
$m \in M^c_{0,loc}$, denoted by H.m, the unique element of $K^2$ such that

$E((H.m)_\infty\,n_\infty) = E((H.[m,n])_\infty),$  (13)

for all n in $K^2$.

Having  justified  the  definition  of  the  stochastic  integral,  we  now  give  a  result 
which  characterizes  it. 

6.7.5. Theorem (Characterization of H.m on $M^c_{0,loc}$):

If $H \in L^2(m)$ and $m \in M^c_{0,loc}$, then $H.m \in K_0^{2,c}$ and H.m is the unique element
of $K^2$ such that

[H.m, n] = H.[m, n],  (14)

for all $n \in K^2$.

6.7.6. Remark: Thus, if Y is a solution of [Y,n] = H.[m,n], an equation equating a
process of bounded variation and a Lebesgue-Stieltjes integral, then Y = H.m.

If we define $\int_0^t H(s)\,dm(s) = (H.m)(t)$, for all $t \ge 0$, the equation (14) takes on
the following form:

$\Big[\int_0^{\cdot} H(s)\,dm(s),\ n\Big](t) = \int_0^t H(s)\,d[m,n](s).$  (14.1)
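
For a simple integrand the characterization can be checked directly. In this sketch $0 \le a < b$ are fixed times, h is a bounded F(a)-measurable random variable, and H is assumed to lie in $L^2(m)$; these symbols are assumptions introduced for illustration.

```latex
\begin{align*}
H &:= h\,1_{((a,b]]}, \qquad
Y_t := h\,(m_{t \wedge b} - m_{t \wedge a}), \\
[Y,n]_t &= h\,\big( [m,n]_{t \wedge b} - [m,n]_{t \wedge a} \big)
  = \int_0^t H(s)\, d[m,n](s) .
\end{align*}
```

So Y satisfies equation (14), and by uniqueness Y = H.m, which agrees with the elementary stochastic integral of H.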


6.7.7. Remark: This Theorem is due to Kunita and Watanabe. The proof given
here is from Jacod. From earlier remarks, we know that $Y = H.m \in K^2$. Letting
C be as defined above, $C(Y^d) = E(Y(\infty)Y^d(\infty))$ and, since the purely
discontinuous and continuous parts of Y are orthogonal, we have

$C(Y^d) = E((Y^c(\infty) + Y^d(\infty))\,Y^d(\infty)) = E((Y^d(\infty))^2).$

But since m is continuous, we have $[m, Y^d] = 0$, the zero process. Therefore,
by definition of the stochastic integral, $C(Y^d) = E((H.[m,Y^d])(\infty)) = 0$ and
consequently $E((Y^d(\infty))^2) = 0$. Therefore, $Y^d = 0$, the zero process (it is clear
that we are working with equivalence classes), and so Y is continuous and
$Y_0 = 0$ (the latter with the convention $Y_{0-} = 0$). That is, $Y \in K_0^{2,c}$.

Next, for each optional time, T, and each $n \in K^2$ we know that
$Y n^T - [Y,n^T] \in K_0^1$ and $[Y,n^T] = [Y,n]^T$; hence,


$E([Y,n](T)) = E(Y(T)\,n(T)) = E(E(Y(\infty) \mid F_T)\,n(T))$

$= E(Y(\infty)\,n^T(\infty)) = C(n^T) = E(H.[m,n^T](\infty))$
$= E(H.[m,n]^T(\infty)) = E(H.[m,n](T)).$

That is,


$E[Y,n](T) = E\,H.[m,n](T)$
for any $n \in K^2$ and optional time T.


It follows from Theorem 6.6.8 that $[Y,n] - H.[m,n] \in M_0$. But [Y,n] - H.[m,n]
is a process of bounded variation which is previsible (the latter since Y and m
are continuous), so that [Y,n] - H.[m,n] = 0, the zero process. This proves
the theorem in one direction.


Conversely, let $Y \in K^2$ satisfy [Y,n] = H.[m,n] for all $n \in K^2$. Then
$E[Y,n](\infty) = E(H.[m,n](\infty)) = C(n)$, while $E[Y,n](\infty) = E(Y(\infty)n(\infty))$.
Therefore, $E(Y(\infty)n(\infty)) = C(n)$ for all $n \in K^2$. Then, by definition, Y = H.m,
completing the proof.


Remark: We have observed that when m is continuous, H.m is continuous.
Hence, [H.m,n] = <H.m,n> and we remarked earlier that m continuous gave
us [m,n] = <m,n> so under the assumption of the theorem, equation (14) can
be expressed as

$\langle H.m, n \rangle = H.\langle m,n \rangle.$  (14.2)


Further, from the properties of [ , ], we can show

6.7.8. Corollary:

$(H.m)^T = H.m^T = (H\,1_{[[0,T]]}).m$ for all optional times T.

6.7.9. Remark: Let $H \in L^1_{loc}(m)$ and $T_n \uparrow \infty$ a localization of H relative to
$L^2(m)$. (No, the 2 is not a mistake; recall that $L^1_{loc} = L^2_{loc}$.) That is,
$H\,1_{[[0,T_n]]} \in L^2(m)$. The previous Corollary provides us with a way of extending
the definition of H.m to H in $L^1_{loc}$ by setting $(H.m)^{T_n} = (H\,1_{[[0,T_n]]}).m$, for
each n. The result is called the stochastic integral of H relative to m. Thus, the
stochastic integral has been defined for $H \in L^1_{loc}$ and $m \in M^c_{0,loc}$. It satisfies a
characterization analogous to that stated in the last Theorem and the same
equations as given in the Corollary.

6.8.  Stochastic  Integrals  Relative  to  Local  Martingales: 

6.8.1. Definition: If $H \in L^1_{loc}(m)$ and $m \in M_{loc}$, then the stochastic integral H.m
of H relative to m is the unique element of $M_{loc}$ which satisfies
$(H.m)^c = H.m^c$, $\Delta(H.m) = H\,\Delta m$.

6.8.2. Remark: Recall that if m is a nonconstant continuous local martingale then
the paths of m are of unbounded variation. So, in this case, H.m should never be mistaken
for  the  Lebesgue-Stieltjes  (pathwise)  integral  of  H  relative  to  m.  We  know  that 
such  objects  do  not  exist. 

Hence, until this Section, either $m \in M^c_{loc}$ or $m \in M_{loc} \cap BV$ and so the
stochastic integral H.m was either that of the last Section or the Lebesgue-Stieltjes
stochastic integral, respectively. The last definition considers $m \in M_{loc}$ so now
the possibility of an inconsistency in our definition of H.m arises. Jacod shows
that Y = H.m as just defined cannot have two distinct meanings. He argues as
follows: If $m \in M_{loc} \cap BV$, $H \in L^1_{loc}(m)$, and $n(t) = \int_0^t H(s)\,dm(s)$ exists as a
Lebesgue-Stieltjes integral, then $n \in (IV)_{loc}$. This can be shown to imply that
$n \in M^d_{loc}$. But $m \in M_{loc} \cap BV \subset M^d_{loc}$ so that $m^c = 0$. But then, by definition,
$Y^c = 0$ ($Y^c = H.m^c$). So $Y \in M^d_{loc}$ also. Now recall the “you know them by
their jumps” description of $M^d_{loc}$ given earlier. Using the facts that
$\Delta n = H\,\Delta m$ and, by definition, $\Delta Y = H\,\Delta m$, we conclude that Y = n.

Remark: We have just noted that the stochastic integral of this Section reduces
to the Lebesgue-Stieltjes stochastic integral when $m \in M_{loc} \cap BV$, so it is
important to realize that, contrary to the case of the Lebesgue-Stieltjes integral,
H.m is in general not defined pathwise; its definition depends on the underlying
probability and filtration.




6.8.3. Theorem (Characterization of H.m on $M_{loc}$):

(1) Let $m \in M_{loc}$ and $H \in L^1_{loc}(m)$. Then H.m is the unique
element of $M_{loc}$ which satisfies [H.m,n] = H.[m,n] for all
$n \in M_{loc}$.

(2) In order that $H.m \in K^p$ (respectively $K^p_{loc}$) it is necessary
and sufficient that $H \in L^p(m)$ (respectively $L^p_{loc}(m)$).

6.8.4. Remark: Part 1 echoes the characterizations of stochastic integrals on
$M^c_{loc}$. Thus, the definition of H.m on $M_{loc}$ is consistent with the definition on
$M^c_{loc}$. Part 2 says that the “size” of the integral is directly related to the “size” of
the integrand. This result is not surprising when the definition of $L^p(m)$ is
recalled.

6.8.5. Remark: A sketch of the proof that [H.m,n] = H.[m,n] is as follows. By
definition of [ , ], $[H.m,n] = \langle (H.m)^c, n^c \rangle + \sum \Delta(H.m)\,\Delta n$. By definition of
H.m, $(H.m)^c = H.m^c$, so that $\langle (H.m)^c, n^c \rangle = \langle H.m^c, n^c \rangle$. The latter equals
$[H.m^c, n^c]$, which by the last characterization for continuous local martingales
equals $H.[m^c,n^c] = H.\langle m^c,n^c \rangle$. Finally, since $\Delta(H.m) = H\,\Delta m$, we have
$[H.m,n] = H.\langle m^c,n^c \rangle + \sum H\,\Delta m\,\Delta n = H.(\langle m^c,n^c \rangle + \sum \Delta m\,\Delta n) = H.[m,n]$.
For the converse we must show that if $Y \in M_{loc}$ and [Y,n] = H.[m,n] for all
$n \in M_{loc}$, then $Y^c = (H.m)^c = H.m^c$ and $\Delta Y = H\,\Delta m$. The interested reader
should just write $Y = Y^c + Y^d$, and $m = m^c + m^d$ and proceed, or see Jacod.

6.9.  Stochastic  Integrals  Relative  to  Semi-Martingales 

6.9.1. For simplicity, let H be a bounded previsible process. Let $X \in S$ and have
the decompositions X = m + A = n + B with the usual meanings. By the
previous two Sections, the stochastic integrals H.m, H.n and the Lebesgue-Stieltjes
integrals H.A, H.B are well-defined. Since m - n = B - A is a local martingale of
bounded variation, we know by the consistency of the stochastic and
Lebesgue-Stieltjes stochastic integrals that H.(m - n) = H.(B - A). Therefore, the formula
H.X = H.m + H.A defines the stochastic integral of a semi-martingale and this
definition is independent of the choice of the decomposition. The resulting
expression H.X is called the stochastic integral of H relative to X.
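
The independence of the decomposition can be written in one line. This sketch uses only linearity of both integrals and the consistency just quoted; on the right, H.(m - n) is a stochastic integral and H.(B - A) a Lebesgue-Stieltjes integral of the same process.

```latex
\begin{align*}
(H.m + H.A) - (H.n + H.B)
  &= H.(m - n) - H.(B - A) \\
  &= 0 \qquad \text{since } m - n = B - A \in M_{loc} \cap BV .
\end{align*}
```

Hence both decompositions of X produce the same process H.X.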

6.9.2. Remark: Properties specifically derived or implied in these Sections are
summarized (with a minimum of special notation) in the following Portmanteau
Theorem. The first part of the Theorem contains a result referred to in the
introduction concerning the extension of the stochastic integral beyond the class
of semi-martingales. It also includes a stochastic integral version of Lebesgue's
Dominated Convergence Theorem and relates the elementary stochastic integral
discussed in the Outline in Section 6.2.2 to the integral developed in this
Chapter.

For additional details on the construction of the stochastic integral the reader
should consult Dellacherie-Meyer [1982,313], Jacod [1979], and Dellacherie [1978].

Let E be the space of elementary processes introduced in 6.2.3, and equip E with
the topology of uniform convergence. Denote by $L^0 = L^0(F,P)$ the space of finite
measurable functions equipped with the topology of convergence in probability,
P.

6.9.3. Theorem (Portmanteau)

(1) Let X be a fixed Skorokhod process and let H.X denote the elementary stochastic
integral of H relative to X. Then the mapping H → H.X, from E to L0, defined by

$$ H \mapsto \int_0^t H(s)\,dX(s) := H.X(t) \qquad (*) $$

for each non-negative t, is continuous iff X is a semi-martingale.

(2) Let X be a semi-martingale. The mapping from E into L0 defined by (*) can
be extended uniquely to the space of all bounded, previsible processes in such a
way that (retaining the notation H.X) the mapping H → H.X is linear, the process
H.X is Skorokhod and the following properties hold:

(a) (Lebesgue): If the sequence (Yn) of bounded previsible
processes converges pointwise to a process, Y, and the Yn are
dominated in absolute value by a bounded previsible process,
then Y is a bounded previsible process and the sequence
Yn.X converges in probability to Y.X.

(b) For every bounded, previsible H, H.X is a semi-martingale.
Also, if X is a special semi-martingale, then H.X
is a special semi-martingale.

(c) For every H in E, the (extended) stochastic integral, H.X,
coincides with the elementary integral.




(d) If X is of bounded variation and H is bounded and previsible,
then the (extended) stochastic integral, H.X, is indistinguishable
from the stochastic integral, H.X, defined pathwise by

$$ H.X(t,\omega) = \int_{(0,t]} H(s,\omega)\,dX(s,\omega), \quad \text{for each } \omega \in \Omega, $$

as a Lebesgue-Stieltjes integral.

(e) If H and K are bounded previsible processes, then
K.(H.X) = (KH).X, and $\Delta(H.X) = H\,\Delta X$.

(f) If T is a stopping time,

$$ (H.X)^T = (H\,1_{[[0,T]]}).X = H.(1_{[[0,T]]}.X) = H.X^T. $$

(h) If H is a bounded, previsible process, then H.X is a martingale,
local martingale or process of bounded variation, if X
is one of these processes.
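
Properties (e) and (f) can be checked by hand in a discrete-time analogue, where the elementary integral is a finite sum. The following is only a sketch under that assumption: the function names `integral` and `stopped`, the sample path and the integrands are hypothetical and chosen for illustration, and the stopping time is taken to be a fixed index.

```python
from itertools import accumulate

def integral(H, X):
    """Discrete elementary integral: (H.X)_n = sum_{k<n} H[k] * (X[k+1] - X[k]).

    H[k] plays the role of the previsible integrand on (t_k, t_{k+1}];
    X is the path sampled at t_0, ..., t_N. Returns a path of length len(X).
    """
    increments = [h * (X[k + 1] - X[k]) for k, h in enumerate(H)]
    return [0.0] + list(accumulate(increments))

def stopped(X, T):
    """Path X stopped at index T: X^T_n = X_{min(n, T)}."""
    return [X[min(n, T)] for n in range(len(X))]

X = [0.0, 1.0, 0.5, 2.0, 1.5, 3.0]   # sample path at t_0, ..., t_5
H = [1.0, -2.0, 0.5, 3.0, 1.0]       # integrand on each of the 5 intervals
K = [2.0, 1.0, -1.0, 0.5, 4.0]
T = 3                                # a deterministic stopping index

HX = integral(H, X)

# (e): K.(H.X) = (KH).X
lhs_e = integral(K, HX)
rhs_e = integral([k * h for k, h in zip(K, H)], X)

# (f): (H.X)^T = (H 1_[[0,T]]).X = H.X^T
lhs_f = stopped(HX, T)
mid_f = integral([h if k < T else 0.0 for k, h in enumerate(H)], X)
rhs_f = integral(H, stopped(X, T))
```

In this finite setting both identities reduce to rearrangements of finite sums, which is why they hold exactly and not merely up to indistinguishability.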

6.10. Local Characteristics of Semi-Martingales: In this Section we will
only add a few remarks to what has already been written with the aim of showing
some relationships between several of these concepts and with a portion of
the classical theory of stochastic processes (processes with independent increments).
Recall Corollaries 6.5.7 and 6.5.8. If $X \in S$ and we define

$$ Y_t := X_0 + \sum_{s \le t} \Delta X_s\, 1_{[\,|\Delta X_s| > 1\,]}, \qquad (15) $$

then X - Y is a semi-martingale with bounded jumps and hence a special
semi-martingale. Therefore,

$$ X - Y = m + \alpha, $$

where m is a local martingale with uniformly bounded jumps and $\alpha$ is a
previsible process of integrable variation. Both m and $\alpha$ vanish identically at time zero.
Decompose m into its continuous and purely discontinuous parts,
$m = m^c + m^d$, and recall that $X^c := m^c$. It follows that

6.10.1. Lemma.

If $X \in S$, then X can be written in the form

$$ X_t = X_0 + \alpha_t + X_t^c + (Y_t - X_0) + m_t^d \qquad (16) $$

and this representation is unique.

Remark: Let $\mu$ be the saltus measure of X (Chapter 4):

$$ \mu(\omega, dt, dx) = \sum_{s > 0} 1_{[\Delta X_s(\omega) \ne 0]}\, \varepsilon_{(s,\, \Delta X_s(\omega))}(dt, dx), \qquad (17) $$

and $\nu$ be the dual previsible projection of the random measure $\mu$. Setting
$\beta = \langle X^c, X^c \rangle$, the triple $(\alpha, \beta, \nu)$ is called the triple of P-local characteristics
of the semi-martingale X. This triple is uniquely determined by the
semi-martingale X, to within a P-null set. But while $\beta$ and $\nu$ are intrinsic
characteristics of X, the component $\alpha$ depends on the "truncation point" in the definition of
Y in (15). Therefore, the triple does not characterize the semi-martingale X.

In Chapter 4 integration relative to a random measure was taken in the sense of
a Lebesgue-Stieltjes integral. But we also noticed there that
$\mu((0,t] \times B) - \nu((0,t] \times B)$ is a local martingale, for each measurable set B and t > 0.
In fact, it is a purely discontinuous local martingale. So, if we want to integrate relative to
$\mu - \nu$ we need to at least recognize the fact that yet another stochastic integral is
required. We will not go into the construction of this type of stochastic integral,
but recommend Part I of the 1978 paper by Kabanov, Liptser and Shiryaev in
the Sbornik or Jacod [1979, p. 96].

With the aid of this stochastic integral, which we will denote by $\int d(\mu - \nu)$, and
with $m^d$ as in (16), Kabanov et al. show that

$$ m_t^d = \int_0^t \int_{\{x:\, |x| \le 1\}} x\, d(\mu - \nu). \qquad (18) $$

We can also write the process Y in (15) (as a path integral) in terms of the saltus
measure $\mu$ of X:

$$ Y_t - X_0 = \int_0^t \int_{\{x:\, |x| > 1\}} x\, \mu(ds, dx). \qquad (19) $$

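Thus, we can state the following

6.10.2. Theorem:

If $X \in S$, with saltus measure $\mu$ and local characteristics $(\alpha, \beta, \nu)$, then

$$ X_t = X_0 + \alpha_t + X_t^c + \int_0^t \int_{\{x:\, |x| > 1\}} x\, \mu(ds, dx) + \int_0^t \int_{\{x:\, |x| \le 1\}} x\, d(\mu - \nu), \qquad (20) $$

where $\langle X^c, X^c \rangle = \beta$, and this representation is a.s. P unique.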
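
To make the role of the truncation point concrete, the following sketch splits the jumps of a path with finitely many jumps into the "large" part, which is the integral in (19), and the "small" part. It is an illustration only: the function name `split_jumps` and the jump data are hypothetical, the terms $\alpha$ and $X^c$ are ignored, and the compensator $\nu$ is not computed, so the small-jump sum is the uncompensated one.

```python
def split_jumps(jumps, t, cutoff=1.0):
    """Split the jumps of a finite-activity pure-jump path up to time t.

    jumps: list of (time, size) pairs, i.e. the atoms of the saltus measure.
    Returns (big, small) where
      big   = sum of jumps with |size| >  cutoff  (the term Y_t - X_0 in (19)),
      small = sum of jumps with |size| <= cutoff  (not compensated here).
    """
    big = sum(dx for (s, dx) in jumps if s <= t and abs(dx) > cutoff)
    small = sum(dx for (s, dx) in jumps if s <= t and abs(dx) <= cutoff)
    return big, small

jumps = [(0.5, 2.0), (1.0, -0.25), (1.5, 0.75), (2.0, -3.0), (4.0, 5.0)]
x0 = 1.0

big, small = split_jumps(jumps, t=3.0)
x_t = x0 + big + small          # value of the pure-jump path at t = 3

# A different truncation point moves mass between the two parts
# but leaves their sum unchanged.
big2, small2 = split_jumps(jumps, t=3.0, cutoff=0.5)
```

Changing `cutoff` changes both parts but not their sum, which mirrors the remark that the first characteristic depends on the truncation point while the process itself does not.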


This representation of a semi-martingale allows one to relate semi-martingales to
P. Levy's remarkable theory of processes with independent increments
(Levy [1937] and Loeve [1960]).


6.10.3. Definition: A process with independent increments (II) on a filtered
probability space $(\Omega, H, F, P)$ is a Skorokhod process X adapted to F such that for
each pair (s,t) with $0 \le s < t < \infty$ the random variable $X_t - X_s$ is probabilistically
independent of $F_s$. Further, a process X with II is said to be a process with
stationary independent increments (SII) if $X_0 = 0$ and $X_t - X_s$ has the same
distribution as $X_{t-s}$, $0 \le s < t < \infty$.

Remark: The most famous examples of processes with stationary independent
increments are Brownian motion and the Poisson process. Standard Brownian
motion, also called the standard Wiener process, is a process B with the
properties that B is F adapted and for each pair of numbers (s,t), $0 \le s < t < \infty$, the
random variable $B_t - B_s$ has a normal distribution with zero expectation, variance
t - s and is independent of $F_s$.

Since

$$ E(B_t^2 - B_s^2 \mid F_s) = E((B_t - B_s)^2 \mid F_s) = E(B_t - B_s)^2 = t - s, $$

it follows that $\langle B, B \rangle_t = t$, $t \ge 0$. Notice that this also shows that
$(B_t^2 - t,\ t \ge 0)$ is an F-martingale. B is obviously an F-martingale also and it can
be shown that P-almost all of its paths are continuous. Since a continuous
martingale whose paths are of bounded variation must be constant, the paths of Brownian
motion are of unbounded variation with probability one.
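
A small simulation can make these two facts visible. The sketch below is not part of the original development: it assumes an equally spaced partition of [0, t], uses hypothetical helper names, and fixes a seed purely for reproducibility. It computes the sum of squared increments, which should be close to t, and the sum of absolute increments, which grows as the partition is refined.

```python
import math
import random

def brownian_increments(t, n, rng):
    """n independent N(0, t/n) increments of standard Brownian motion on [0, t]."""
    sd = math.sqrt(t / n)
    return [rng.gauss(0.0, sd) for _ in range(n)]

def quadratic_variation(increments):
    """Sum of squared increments over the partition."""
    return sum(d * d for d in increments)

def total_variation(increments):
    """Sum of absolute increments over the partition."""
    return sum(abs(d) for d in increments)

rng = random.Random(12345)   # fixed seed, for reproducibility only
t = 1.0
results = {}
for n in (100, 10_000):
    dB = brownian_increments(t, n, rng)
    results[n] = (quadratic_variation(dB), total_variation(dB))

for n, (qv, tv) in results.items():
    print(f"n = {n:6d}   sum of squares = {qv:.4f}   sum of |increments| = {tv:.2f}")
```

The expected value of the sum of absolute increments is $\sqrt{2nt/\pi}$, so it diverges with n, while the sum of squares has mean t and variance $2t^2/n$.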

Any reference to a Brownian motion process will mean a process X such that
$X_t = mt + \sigma B_t$, where m is any real number, $\sigma > 0$, and B is standard Brownian
motion.

Not only is B a process with stationary independent increments, but if X is any
SII process which is a.s. P continuous then X is Brownian motion ($X_t = mt + \sigma B_t$).
That is, every a.s. P continuous process with stationary independent increments is
a Brownian motion process.

Since standard Brownian motion is a martingale it follows that every Brownian
motion process is a semi-martingale. Poisson processes are submartingales, so
they are also semi-martingales by the Doob-Meyer Decomposition Theorem. But
not every process with independent increments is a semi-martingale. Jacod [1979]
shows that a process with independent increments is a semi-martingale iff, for each
real u, the function $t \mapsto E e^{iuX_t}$ has finite variation on compact sets.

Remark: At this point it might be of some interest to readers of this note to
glance back at Loeve's 1960 book on probability. Specifically, refer to Section 22
where the classical Central Limit Problem is defined and recall the role played by
"infinitely divisible" random variables in the solution of this problem. Then turn
to Section 37 and look at the definition of a "decomposable" random function
(stochastic process); this is a process with independent increments. Some of the
principal results there indicate the beginnings of the modern theory of
semi-martingales and random measures.

Remark: Jacod (Jacod [1979, 90-95]) shows that semi-martingales which are
processes with independent increments have deterministic local characteristics
(i.e., there exists a version of the triple $(\alpha, \beta, \nu)$ that does not depend on $\omega \in \Omega$) and
conversely only semi-martingales with II have this property. When the local
characteristics have the additional property that $\alpha$ and $\beta$ are linear in t and $\nu$ is
a particular product measure on $(0,\infty) \times R$, then these processes also have
stationary increments. This provides a useful link between the classical and modern
theories of stochastic processes.

6.11.  Ito’s  Formula  and  Applications  to  Brownian  Motion:  We  will  limit 
our  discussion  of  Ito’s  formula  to  processes  with  continuous  paths.  Stochastic 
integrals  relative  to  this  type  of  process  are  the  most  studied  because  of  their 
close  connection  to  Brownian  motion  and  stochastic  differential  equations. 


6.11.1. Remark: Let the function K: R → R have continuous second order derivatives.
Let m be a continuous function on R+. Then using a finite Taylor series
expansion applied to the increments of K, we have

$$ \begin{aligned}
K(m_t) - K(m_0) &= \sum_{k=0}^{n-1} \bigl( K(m_{t_{k+1}}) - K(m_{t_k}) \bigr) \\
&= \sum_{k=0}^{n-1} K'(m_{t_k})\,\Delta m_{t_k} + \frac{1}{2} \sum_{k=0}^{n-1} K''(m_{t_k})\,(\Delta m_{t_k})^2 + r_n^{(2)},
\end{aligned} $$

where $\Delta m_{t_k} := m_{t_{k+1}} - m_{t_k}$.
(E.g., $t_k = t_k^{(n)} = tk/n$, so that $0 = t_0 < t_1 < \dots < t_n = t$.)

If m is of finite variation (in addition to being continuous), the remainder $r_n^{(2)}$ and
$\sum_{k=0}^{n-1} (\Delta m_{t_k})^2$ converge to zero and we have the usual change of variable formula for
Stieltjes integrals:

$$ K(m_t) - K(m_0) = \int_0^t K'(m_s)\,dm_s, $$

or symbolically $dK(m_t) = K'(m_t)\,dm_t$.


Now, if we replace the function $m_t$ by a standard Brownian motion, B, a continuous
process of unbounded variation, it can be shown that

$$ \sum_{k=0}^{n-1} (\Delta B_{t_k})^2 \to \langle B, B \rangle_t = t, \quad \text{a.s. } P, \qquad (*) $$

and $r_n^{(2)} \to 0$ as $n \to \infty$. Further, for each higher order of the Taylor expansion, both the
corresponding sequence of sums of powers of the increments and the
remainders converge to zero. Therefore we would expect the

change of variable formula for stochastic integrals with Brownian integrators to
be of the form

$$ K(B_t) - K(B_0) = \int_0^t K'(B_s)\,dB_s + \frac{1}{2} \int_0^t K''(B_s)\,ds. $$

This is Ito's original formula for Brownian motion. When B is replaced by a continuous
local martingale M, equation (*) continues to hold but the limit is the
process $\langle M, M \rangle$, the compensator of the submartingale $M^2$. We will show below
that the process $\langle M, M \rangle$ is distinguishable from $\langle B, B \rangle$ unless M is itself a
standard Brownian motion. So for
any continuous local martingale one would expect that the change of variables
formula for stochastic integrals would become

$$ K(M_t) - K(M_0) = \int_0^t K'(M_s)\,dM_s + \frac{1}{2} \int_0^t K''(M_s)\,d\langle M, M \rangle_s. $$

This  is  the  claim  of  the  next  Theorem. 

Let $(\Omega, H, F, P)$ be a filtered probability space. Take m to be a continuous local
martingale and recall that $M^c_{loc} \subset M^2_{loc}$, so that $\langle m, m \rangle$ exists. We will say that
X is a continuous semi-martingale if X = m + A, with m as specified above
and A a continuous process of bounded variation on finite intervals. Then the
following form of Ito's change of variables formula holds (Kunita,
Watanabe [1967]; Meyer [1976]):

6.11.2. Theorem:

Let X be a continuous semi-martingale and K be a function mapping R → R and
having continuous second derivatives on R. Then the process Y = K(X) is a
semi-martingale and (up to indistinguishability)

$$ K(X_t) - K(X_0) = \int_0^t K'(X_s)\,dX_s + \frac{1}{2} \int_0^t K''(X_s)\,d\langle X, X \rangle_s. \qquad (21) $$

This change of variables formula is often written in the purely symbolic form of
"differentials": $dK(X) = K'(X)\,dX + \frac{1}{2} K''(X)\,d\langle X, X \rangle$, but this only has meaning
in terms of the integral equation in (21).
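
As a worked instance of (21), offered only as an illustration, take $K(x) = e^x$. The choice of K and of the two processes below is an assumption made for this example and is not taken from the text; B denotes standard Brownian motion, so that the bracket of each process equals t.

```latex
% Worked example of (21) with K(x) = e^x, so that K' = K'' = K.
% (i) X = B, standard Brownian motion, <B,B>_t = t:
\[
  e^{B_t} - 1 = \int_0^t e^{B_s}\,dB_s + \frac{1}{2}\int_0^t e^{B_s}\,ds .
\]
% (ii) X_t = B_t - t/2, so that dX = dB - (1/2) dt and <X,X>_t = t:
\[
  e^{B_t - t/2} - 1
  = \int_0^t e^{X_s}\,dB_s - \frac{1}{2}\int_0^t e^{X_s}\,ds
    + \frac{1}{2}\int_0^t e^{X_s}\,ds
  = \int_0^t e^{X_s}\,dB_s .
\]
```

In case (ii) the two Lebesgue integrals cancel, so $e^{B_t - t/2} - 1$ is a stochastic integral relative to B alone.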

Remark: Although we will consider only continuous processes in this Section, it
is informative to see how the theorem changes in the case of an arbitrary
semi-martingale:

$$ \begin{aligned}
K(X_t) - K(X_0) &= \int_0^t K'(X_{s-})\,dX_s + \frac{1}{2} \int_0^t K''(X_{s-})\,d\langle X^c, X^c \rangle_s \\
&\quad + \sum_{0 < s \le t} \bigl( K(X_s) - K(X_{s-}) - K'(X_{s-})\,\Delta X_s \bigr). \qquad (21*)
\end{aligned} $$

Remark: When the semi-martingale is purely discontinuous and of bounded
variation, it is clear from the application of Taylor's Theorem above that a change of
variable formula should only involve the first derivative of K. Ito's formula, as
given in the last equation, verifies and extends this to show that in the case of an
arbitrary purely discontinuous semi-martingale, the formula also involves only
the first order derivative of K, since then $X^c = 0$ and the second integral in (21*)
vanishes.

6.11.3. Remark: The Theorem immediately extends to vector valued continuous
semi-martingales (a finite dimensional vector whose components are
continuous semi-martingales): $X = (X^1, X^2, \dots, X^n)$. Let K be a function from
$R^n$ to R having continuous second order partial derivatives. Let $D^iK$ denote the
first order partial derivative of K relative to its ith component, with the obvious meaning
for $D^{ij}K$. The formula takes the form

$$ K(X_t) - K(X_0) = \sum_i \int_0^t (D^iK)(X_s)\,dX_s^i + \frac{1}{2} \sum_{i,j} \int_0^t (D^{ij}K)(X_s)\,d\langle X^i, X^j \rangle_s. \qquad (22) $$

6.11.4. Remark: When $K(u) = u^2$ in (21), and $m \in M^c_{loc}$, Ito's formula gives

$$ m_t^2 - m_0^2 = 2 \int_0^t m_s\,dm_s + \langle m, m \rangle_t. $$

When K(u,v) = uv and we use (22) with $m, n \in M^c_{loc}$, we obtain

$$ m_t n_t - m_0 n_0 = \int_0^t m_s\,dn_s + \int_0^t n_s\,dm_s + \langle m, n \rangle_t. \qquad (23) $$

This integration by parts formula is of course the continuous parameter (continuous
process) analogue of the one in Chapter 1. It can be extended to general
semi-martingales.
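
The discrete parameter analogue of the integration by parts formula is an exact algebraic identity, and this can be checked numerically. The sketch below assumes two simple symmetric random walks as the discrete martingales; the function names are hypothetical and the seed is fixed only for reproducibility. The integrals are replaced by left-point sums and the bracket by the sum of products of increments.

```python
import random

def discrete_parts(m, n):
    """Return the three sums in the discrete analogue of (23):
    sum m_k dn_k, sum n_k dm_k (left-point sums) and the bracket sum dm_k dn_k."""
    steps = range(len(m) - 1)
    int_m_dn = sum(m[k] * (n[k + 1] - n[k]) for k in steps)
    int_n_dm = sum(n[k] * (m[k + 1] - m[k]) for k in steps)
    bracket = sum((m[k + 1] - m[k]) * (n[k + 1] - n[k]) for k in steps)
    return int_m_dn, int_n_dm, bracket

def random_walk(n_steps, rng):
    """Simple symmetric random walk started at 0 (a discrete martingale)."""
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-1.0, 1.0)))
    return path

rng = random.Random(2024)   # fixed seed, for reproducibility only
m = random_walk(1000, rng)
n = random_walk(1000, rng)

a, b, c = discrete_parts(m, n)
lhs = m[-1] * n[-1] - m[0] * n[0]          # equals a + b + c exactly

# Taking n = m gives the first identity: m_N^2 - m_0^2 = 2 sum m_k dm_k + sum (dm_k)^2
a2, b2, c2 = discrete_parts(m, m)
```

For the random walk every squared increment equals 1, so the bracket sum `c2` is the number of steps, the discrete counterpart of $\langle B, B \rangle_t = t$.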

Remark: It is useful to allow the map K in Ito's Theorem to be a complex-valued
function. For this purpose, the expectation, conditional expectation, and so on,
of complex valued processes are defined in terms of their real counterparts via the
real and imaginary parts of the process. For instance, a complex valued
martingale is one whose real and imaginary parts are martingales.

Remark: The following is the canonical first application of Ito's Theorem. It is
due to P. Levy. As proved in Doob [1953], it assumes that X is a continuous
martingale with the property that $(X_t^2 - t)$ is a martingale. The statement and proof
given here are due to Kunita and Watanabe [1967]. They use their extension of Ito's
formula and assume for the proof of Levy's Theorem only that X is a continuous
local martingale satisfying the condition that $\langle X, X \rangle_t = t$. When X is a
martingale, this latter condition is equivalent to the requirement that $(X_t^2 - t)$ is a
martingale as in Doob's statement of Levy's Theorem. Our presentation of the
Kunita-Watanabe proof is due in part to Chung and Williams [1983].

6.11.5. Theorem:

X is standard Brownian motion relative to the filtration F if, and only if,
$X \in M^c_{0,loc}(F)$ and $\langle X, X \rangle_t = t$, $t \ge 0$.

Remark: The condition is necessary by a remark in the last Section. In order to
prove that the condition is sufficient, define $K_u$ on R by $K_u(x) := e^{iux}$, for each u
in R and apply the Ito formula. From (21), since $X_0 = 0$, we obtain

$$ K_u(X_t) - 1 = \int_0^t K_u'(X_s)\,dX_s + \frac{1}{2} \int_0^t K_u''(X_s)\,ds. $$

That is,

$$ e^{iuX_t} - 1 = iu \int_0^t e^{iuX_s}\,dX_s - \frac{u^2}{2} \int_0^t e^{iuX_s}\,ds. \qquad (24) $$

The second integral on the right of (24) results from (21) and $\langle X, X \rangle_s = s$. The
first integral on the right of (24) is a martingale, because its integrand is a
bounded, previsible process and the stopped process $X^t$ is a martingale for any t.
Therefore, $E\bigl( \int_s^{t+s} e^{iuX_v}\,dX_v \mid F_s \bigr) = 0$, for $s, t \ge 0$. Then, from the definition of
conditional expectation, if $B \in F_s$, $E\bigl( 1_B \int_s^{t+s} e^{iuX_v}\,dX_v \bigr) = 0$. It follows that

$$ \begin{aligned}
E\bigl( 1_B ( e^{iuX_{t+s}} - e^{iuX_s} ) \bigr)
&= -\frac{u^2}{2}\, E\Bigl( 1_B \int_s^{t+s} e^{iuX_v}\,dv \Bigr) \\
&= -\frac{u^2}{2} \int_s^{t+s} E( 1_B e^{iuX_v} )\,dv, \qquad (25)
\end{aligned} $$

with an application of Fubini's Theorem. If we define $g_s(t) := E( 1_B e^{iuX_{t+s}} )$, equation
(25) becomes

$$ g_s(t) - g_s(0) = -\frac{u^2}{2} \int_0^t g_s(v)\,dv. $$

It can be shown from this equation that $g_s$ must satisfy $g_s(t) = g_s(0)\,e^{-u^2 t/2}$.
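
The step from the integral equation to the exponential form can be filled in as follows. This is the standard integrating factor argument, supplied here as a sketch and not quoted from the sources cited above.

```latex
% g_s is bounded (|g_s| <= 1), so the right-hand side of
%   g_s(t) = g_s(0) - (u^2/2) \int_0^t g_s(v) dv
% is continuous in t; hence g_s is continuous, and therefore differentiable:
\[
  g_s'(t) = -\frac{u^2}{2}\, g_s(t), \qquad t \ge 0 .
\]
% Multiply by the integrating factor e^{u^2 t/2}:
\[
  \frac{d}{dt}\Bigl( e^{u^2 t/2}\, g_s(t) \Bigr)
  = e^{u^2 t/2}\Bigl( g_s'(t) + \frac{u^2}{2}\, g_s(t) \Bigr) = 0 ,
\]
% so e^{u^2 t/2} g_s(t) is constant and equal to its value at t = 0:
\[
  g_s(t) = g_s(0)\, e^{-u^2 t/2} .
\]
```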

Therefore, again using the definition of c.exp. and the fact that $e^{iuX_s}$ is
$F_s$-measurable, we obtain

$$ E\bigl( e^{iu(X_{t+s} - X_s)} \mid F_s \bigr) = e^{-u^2 t/2}, \qquad (26) $$

and so if Y is an arbitrary bounded $F_s$-measurable random variable,

$$ E\bigl( Y e^{iu(X_{t+s} - X_s)} \mid F_s \bigr) = Y e^{-u^2 t/2}. \qquad (27) $$

Hence,

$$ E\bigl( Y e^{iu(X_{t+s} - X_s)} \bigr) = (EY)\, e^{-u^2 t/2}. $$

But from (26), this is the same as

$$ E\bigl( Y e^{iu(X_{t+s} - X_s)} \bigr) = (EY)\, E\bigl( e^{iu(X_{t+s} - X_s)} \bigr). $$

Since this holds for every real u and every bounded $F_s$-measurable Y, it follows that
$(X_{t+s} - X_s)$ is independent of $F_s$. Again, by (26),
$E\bigl( e^{iu(X_{t+s} - X_s)} \bigr) = e^{-u^2 t/2}$, the
characteristic function of a Normal zero mean random variable with variance t.
Therefore, X is standard Brownian motion.

6.11.6. Remark: We have already noticed in the previous section that Brownian
motion is the only continuous process with stationary independent increments.
The following observation is a much stronger indication of the importance of
Brownian motion in the General Theory of Stochastic Processes. It says that a
large class of continuous local martingales are but "a time change away from
being Brownian motion".

Let M be a continuous local martingale and suppose that $\langle M, M \rangle_\infty = \infty$.
Then, if we define

$$ \tau_t = \inf\{ s : \langle M, M \rangle_s > t \}, $$

the process $\beta$ defined by setting $(\beta_s := M_{\tau_s},\ s \ge 0)$ can be shown to be an
$(F_{\tau_s})$-Brownian motion process and $M_t = \beta_{\langle M, M \rangle_t}$. (Dubins, Schwarz [1965].)

Remark: We now give a very simple application of Ito's formula that will be
extended to vector valued processes later. Let K: R → R have two continuous
derivatives and introduce the differential operator, L, by setting

$$ LK := mK' + \frac{1}{2} \sigma^2 K''. $$

Let X be a Brownian motion process:

$$ X_t = mt + \sigma B_t, $$

where $m \in R$, $t \ge 0$, $\sigma > 0$ and B is standard Brownian motion. Then, in differential
form, Ito's formula gives

$$ \begin{aligned}
dK(X) &= K'(X)\,dX + \frac{1}{2} K''(X)\,d\langle X, X \rangle \\
&= K'(X)(m\,dt + \sigma\,dB) + \frac{1}{2} K''(X)\,\sigma^2\,dt \\
&= \sigma K'(X)\,dB + \bigl( mK'(X) + \frac{1}{2} \sigma^2 K''(X) \bigr)\,dt.
\end{aligned} $$

Therefore,

$$ dK(X) = \sigma K'(X)\,dB + (LK)(X)\,dt. $$

Since K' is continuous, so that K'(X) is previsible, and B is a martingale, it follows that

$$ K(X_t) - K(X_0) - \int_0^t (LK)(X_s)\,ds = \int_0^t \sigma K'(X_s)\,dB_s $$

is a martingale (in general a local martingale; a martingale when, for example, K' is bounded).
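
For a concrete check of this statement, consider the choice $K(x) = x^2$. The example is an illustration added here under the assumption $X_0 = 0$; it is not taken from the text.

```latex
% Example: K(x) = x^2, so K'(x) = 2x, K''(x) = 2 and
\[
  (LK)(x) = 2mx + \sigma^2 .
\]
% With X_t = mt + \sigma B_t (X_0 = 0) the identity above reads
\[
  X_t^2 - \int_0^t \bigl( 2mX_s + \sigma^2 \bigr)\,ds
  = 2\sigma \int_0^t X_s\,dB_s .
\]
% Consistency check on expectations (the right side has mean zero):
\[
  E X_t^2 = m^2 t^2 + \sigma^2 t ,
  \qquad
  E \int_0^t \bigl( 2mX_s + \sigma^2 \bigr)\,ds
  = \int_0^t \bigl( 2m^2 s + \sigma^2 \bigr)\,ds
  = m^2 t^2 + \sigma^2 t .
\]
```

The two expectations agree, as they must if the left side of the identity is a martingale started at zero.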


6.11.7. Remark: We conclude this Section and the Chapter with a brief look at
stochastic differential equations. To be consistent with the generality of the
stochastic integral introduced in this Chapter, we will start with the development
of C. Doleans-Dade [1976]. However, our main intent is to introduce stochastic
differential equations "driven" by Brownian motion processes, Ito diffusions, and
relate these to A.N. Kolmogorov's original description of a diffusion. We will use
the Ito formula and a generalization of the operator L defined in the last Remark
to very briefly describe the connection with the Stroock-Varadhan theory [1979].

Let $(\Omega, H, F, P)$ be a filtered probability space with the filtration F satisfying the
"usual conditions". Suppose that $\sigma$ and b are two functions mapping $R_+ \times \Omega \times R$
into R, which are left continuous with right limits in the first factor, F-adapted
relative to the second and satisfy the following uniform Lipschitz condition:

$$ |b(s,\omega,x) - b(s,\omega,y)| + |\sigma(s,\omega,x) - \sigma(s,\omega,y)| \le K |x - y| \qquad (L) $$

for some constant K and all $(s,\omega,x)$, $(s,\omega,y)$ in the domains of $\sigma$ and b. C.
Doleans-Dade [1976] proves the following:

Theorem:

If M is an F-local martingale and A is a process in BV and $\sigma$ and b satisfy the
conditions stated above, then there exists one and only one adapted Skorokhod
process X satisfying the stochastic integral equation

$$ X_t = X_0 + \int_0^t \sigma(s, X_{s-})\,dM_s + \int_0^t b(s, X_{s-})\,dA_s. $$


Remark: As pointed out by Doleans-Dade, the uniform Lipschitz condition in x
implies that the mappings $(\omega,x) \mapsto \sigma(t,\omega,x)$ and $(\omega,x) \mapsto b(t,\omega,x)$ are
$F_t \times B(R)$-measurable. Consequently, the functions $\omega \mapsto \sigma(t,\omega,X_{t-})$ and
$\omega \mapsto b(t,\omega,X_{t-})$ are
F-adapted, if we assume that the process X is Skorokhod and adapted. By the
assumed left continuity and existence of right limits for $\sigma$ and b, we have therefore
that the processes $(\sigma(t,X_{t-}),\ t \ge 0)$ and $(b(t,X_{t-}),\ t \ge 0)$ are adapted, left
continuous and have right limits. Hence, these processes are F-previsible and locally
bounded. Therefore, if X is any adapted Skorokhod process the integrals on the
right side of the equation in the Theorem exist by earlier results in this Chapter.

This Theorem can be extended in several ways. One is that it can be restated for
M as a d-dimensional vector valued process, with $\sigma$ a matrix valued function of
order (n,d) and b a vector valued process with values in $R^n$. The condition (L)
can be modified in an obvious way and if we agree that vector valued processes
are adapted, Skorokhod, etc., when their components have these properties, an
existence and uniqueness Theorem analogous to the one above continues to apply.
We will consider this type of structure in paragraph 6.11.8 below, with the
components of M being independent Brownian motion processes.

A very interesting paper that we mentioned in the first Chapter (Doleans-Dade
[1970]) treats a special case of the stochastic integral equation given above. Suppose that
$\sigma(t,\omega,x) = b(t,\omega,x) = x$; then this stochastic integral equation takes the form

$$ X_t = X_0 + \int_0^t X_{s-}\,dZ_s, $$

where Z is a semi-martingale. In her 1970 paper, C. Doleans-Dade finds the explicit
solution to this equation. It is called the exponential of Z when $X_0 = 1$ and
is given by

$$ X_t = \exp\bigl( Z_t - \tfrac{1}{2} \langle Z^c, Z^c \rangle_t \bigr) \prod_{s \le t} (1 + \Delta Z_s)\, e^{-\Delta Z_s}. $$

The proof that this process satisfies the previous stochastic integral equation is a
simple application of the general Ito formula. If we set $\varepsilon(Z) = X$ in the last
equation, a two line application of integration by parts to evaluate the product
$\varepsilon(Y)\varepsilon(Z)$ for $Y, Z \in S$ yields

$$ \varepsilon(Y)\,\varepsilon(Z) = \varepsilon(Y + Z + [Y,Z]) $$

and not the expected $\varepsilon(Y+Z)$. The expected happens, of course, when [Y,Z] = 0.
For example, this occurs when Y and Z are counting processes representing the
number of arrivals and departures (respectively) at a particular queueing station
when arrivals and departures from the queue never occur at the same time.
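
The product rule for exponentials can be verified directly in the simplest case. The sketch below assumes pure-jump processes of bounded variation with finitely many jumps, no continuous part and initial value zero, for which the exponential formula reduces to the product of the factors $(1 + \Delta Z_s)$. The function names and the jump sizes are hypothetical and chosen only for illustration.

```python
from itertools import accumulate
import operator

def stochastic_exponential(jumps):
    """Exponential of a pure-jump process of bounded variation with Z_0 = 0
    and no continuous part: e(Z)_k = prod_{j<=k} (1 + dZ_j).

    jumps[j] is the jump dZ_j at the j-th event time (0.0 if Z does not jump).
    """
    return list(accumulate((1.0 + dz for dz in jumps), operator.mul))

def bracket_jumps(dY, dZ):
    """Jumps of [Y, Z] for pure-jump processes: d[Y,Z]_j = dY_j * dZ_j."""
    return [y * z for y, z in zip(dY, dZ)]

# Jumps of Y and Z on a common grid of event times; some jumps coincide.
dY = [0.5, 0.0, -0.25, 1.0, 0.0]
dZ = [0.25, -0.5, 0.0, 0.5, 2.0]

eY = stochastic_exponential(dY)
eZ = stochastic_exponential(dZ)
product = [a * b for a, b in zip(eY, eZ)]

combined = [y + z + q for y, z, q in zip(dY, dZ, bracket_jumps(dY, dZ))]
e_combined = stochastic_exponential(combined)                       # e(Y + Z + [Y,Z])
e_naive = stochastic_exponential([y + z for y, z in zip(dY, dZ)])   # e(Y + Z)

# When Y and Z never jump together, [Y,Z] = 0 and e(Y)e(Z) = e(Y + Z).
dY0 = [0.5, 0.0, -0.25, 0.0, 0.0]
dZ0 = [0.0, -0.5, 0.0, 0.5, 2.0]
product0 = [a * b for a, b in zip(stochastic_exponential(dY0),
                                  stochastic_exponential(dZ0))]
e_sum0 = stochastic_exponential([y + z for y, z in zip(dY0, dZ0)])
```

Each factor satisfies $(1 + \Delta Y)(1 + \Delta Z) = 1 + \Delta Y + \Delta Z + \Delta Y \Delta Z$, which is the whole content of the identity in this setting.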

6.11.8. Remark: Now, we will specialize the local martingale in the previous
Theorem to Brownian motion, the integral relative to the process A to an integral
relative to Lebesgue measure, allow the processes $\sigma$ and b to depend on t only
through $X_t$ and, in the other direction, consider multi-dimensional processes.
Thus, let $B = (B_t)$ be a d-dimensional F-Brownian motion process. That is,
$B_t = (B_t^1, \dots, B_t^d)$, where the $B^i$ are P-independent F-Brownian motion
processes; so in particular the distribution of $B_t - B_s$ (t > s) is normal (0,(t-s)I),
where $I = I_{d \times d}$ is the identity matrix.

Thus, $\sigma: R^n \to R^{n \times d}$, $b: R^n \to R^n$ and X satisfies the equation

$$ X_t = X_0 + \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds. \qquad (28) $$

With $X_0 = x$, the process X is called an Ito diffusion, and is said to satisfy the
stochastic differential equation

$$ dX_t = \sigma(X_t)\,dB_t + b(X_t)\,dt, $$

for $t \ge 0$, and $X_0 = x$. $X = (X_t)$ is then a strong Markov process with a.s. P
continuous paths. (For an easy to read account of stochastic differential equations
see Oksendal [1986], in particular, his Theorem 5.5 for an existence and uniqueness
result that covers this case.)
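
Equation (28) can be approximated numerically by replacing the integrals with sums over a fine partition. The sketch below uses the Euler-Maruyama scheme, a standard discretization that is not discussed in the text, in the one-dimensional case n = d = 1. The function name, the example coefficients and the seed are assumptions made for illustration.

```python
import math
import random

def euler_maruyama(sigma, b, x0, t, n, rng):
    """Approximate a path of dX = sigma(X) dB + b(X) dt, X_0 = x0, on [0, t]
    with n equal steps (one-dimensional case)."""
    h = t / n
    path = [x0]
    for _ in range(n):
        x = path[-1]
        dB = rng.gauss(0.0, math.sqrt(h))   # Brownian increment over one step
        path.append(x + sigma(x) * dB + b(x) * h)
    return path

rng = random.Random(7)   # fixed seed, for reproducibility only

# Illustrative coefficients: dX = dB - X dt (mean-reverting drift, unit diffusion).
path = euler_maruyama(lambda x: 1.0, lambda x: -x, x0=2.0, t=1.0, n=1000, rng=rng)

# With sigma = 0 the scheme reduces to Euler's method for dx/dt = b(x).
ode_path = euler_maruyama(lambda x: 0.0, lambda x: -x, x0=2.0, t=1.0, n=1000, rng=rng)
```

Both example coefficients are Lipschitz, so they fall under the existence and uniqueness result quoted above; the scheme itself is only an approximation whose accuracy depends on the step size.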


The kth component, $X^k$, of equation (28) is given in differential form by

$$ dX^k = \sum_{j=1}^d \sigma_{kj}\,dB^j + b_k\,dt. $$

An application of (22) to (28) yields

$$ \begin{aligned}
dK(X) &= \sum_{i=1}^n \sum_{j=1}^d (D^iK)(X)\,\sigma_{ij}\,dB^j \\
&\quad + \sum_i (D^iK)(X)\,b_i\,dt + \frac{1}{2} \sum_{i,k} (D^{ik}K)(X)\,a_{ik}\,dt, \qquad (29)
\end{aligned} $$

where $a = \sigma\sigma^*$, $\sigma^*$ the transpose of $\sigma$.

The extension of the previously defined operator L to functions K on $R^n$ is

$$ (LK)(x) = \sum_i (D^iK)(x)\,b_i + \frac{1}{2} \sum_{i,j} (D^{ij}K)(x)\,a_{ij}. $$

Then from (29) we see that

$$ C_t^K(X) = K(X_t) - K(x) - \int_0^t (LK)(X_s)\,ds \qquad (30) $$

is a martingale, as in the one dimensional case treated earlier.
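
The simplest special case may help fix ideas. The choice of coefficients below is an assumption made for illustration; note also that in this block $\Delta$ denotes the Laplacian and not a jump.

```latex
% Special case: n = d, \sigma = I (identity matrix), b = 0, so X_t = x + B_t
% and a = \sigma\sigma^* = I. Then
\[
  (LK)(x) = \frac{1}{2} \sum_{i=1}^{n} (D^{ii}K)(x) = \frac{1}{2}\,\Delta K(x),
\]
% where \Delta is the Laplacian, and (30) becomes
\[
  C_t^K(X) = K(x + B_t) - K(x) - \frac{1}{2} \int_0^t \Delta K(x + B_s)\,ds .
\]
```

In particular, when K is harmonic the integral term vanishes and the statement reduces to a martingale property of $K(x + B_t)$ itself, subject to the usual integrability conditions.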

Thus, starting with Brownian motion on a given filtered probability space, and
an Ito process X satisfying (28), we associated the operator L with the property
that $C^K(X)$ was a martingale for a large class of functions, K, defined on $R^n$.

There is a "converse" to this result due to Stroock and Varadhan [1979] which is
extremely important in the study of vector valued diffusions and, further, can be
used to define diffusions on more general manifolds than $R^n$. We cannot say
much about the Stroock-Varadhan approach in this note, but highly recommend
the paper by D. Williams [1981] for an introduction to this subject and its
relationship to the Ito method.

Roughly speaking, view a process, X, as a member of the space, W, of continuous
functions from $R_+$ to $R^n$. Take $A_t^*$ to be the $\sigma$-algebra of subsets of W generated
by $\{X_s, s \le t\}$ and set $A^* = A_\infty^*$. Let $x \in R^n$ and L be an operator of the form

$$ (LK)(x) = \sum_i (D^iK)(x)\,b_i + \frac{1}{2} \sum_{i,j} (D^{ij}K)(x)\,a_{ij}, $$

where the matrix valued function "a" and the vector valued function b are
defined on $R^n$.

Suppose that $P^x$ is a probability measure on $(W, A^*)$, with the property that
$P^x(X_0 = x) = 1$ and $C^K$, defined by (30), is a $(W, A^*, (A_t^*), P^x)$-martingale, for all
twice differentiable functions K on $R^n$ having compact support. (Then $P^x$ is said
to solve the martingale problem for L starting from x.)

Finally, if "a" can be written in the form $a_{ij} = (\sigma\sigma^*)_{ij}$, then X is continuous
and there exists a Brownian motion process B such that $X_s$ is independent of
$B_t - B_s$ and

$$ X_t = x + \int_0^t b(X_s)\,ds + \int_0^t \sigma(X_s)\,dB_s. $$


BIBLIOGRAPHY 


Aldous, D.J. (1981). Weak Convergence and the General Theory of Processes.
Draft Monograph, Dept of Statistics, Univ of Calif, Berkeley, CA.

Andersen,  G.  (1986).  An  Application  of  Discrete  Parameter  Martingale  Calculus 
to  Discrete  Point  Processes.  Part  I:  Girsanov  transformations,  and 
compensator  robustness.  In  draft. 

Andersen,  G.  (1986).  An  Application  of  Discrete  Parameter  Martingale  Calculus 
to  Discrete  Point  Processes.  Part  II:  Nonlinear  Filtering.  In  draft. 

Bichteler, K. (1981). Stochastic Integration and Lp-Theory of Semimartingales.
The Annals of Probability Vol. 9 No. 1, 49-89.

Billingsley, P. (1968). Convergence of Probability Measures. John Wiley & Sons.

Boel, R., Varaiya P. and Wong, E. (1975). Martingales on Jump Processes. I:
Representation Results. SIAM J Control Vol 13 No. 5, 999-1021.

Boel, R., Varaiya P. and Wong, E. (1975). Martingales on Jump Processes. II:
Applications. SIAM J Control Vol 13 No. 5, 1022-1061.

Bremaud,  P.  (1975).  An  Extension  of  Watanabe's  Theorem  of  Characterization  of 
Poisson  Processes  over  the  Positive  Real  Half  Line.  J.  Appl.  Prob. 
12  396-399. 

Bremaud, P. (1975). The Martingale Theory of Point Processes over the Real Half
Line Admitting an Intensity. Lect. Notes in Economics and Math.
Systems (Control Theory) 107 Springer-Verlag.

Bremaud, P. (1976). La methode des semi-martingales en filtrage lorsque
l'observation est un processus ponctuel. Seminaire Proba. X,
Universite de Strasbourg, Lect. Notes in Math. 511, Springer-
Verlag, 1-18.

Bremaud,  P.  (1978).  On  the  Output  Theorem  of  Queueing  Theory  via  Filtering. 
J.  Appl.  Prob.  15  397-405. 




Bremaud,  P.  (1981).  Point  Processes  and  Queues  Martingale  Dynamics. 
Springer-Verlag  Series  in  Statistics. 

Bremaud,  P.  and  Jacod,  J.  (1977).  Processus  Ponctuels  et  Martingales:  Resultats 
Recents  sur  la  Modelisation  et  le  Filtrage.  Adv.  Appl.  Prob.  9  362- 
416. 

Bremaud,  P.  and  Yor,  M.  (1978).  Changes  of  Filtrations  and  of  Probability 
Measures.  Z.  Wahr.  verw.  Gebiete  45  269-295. 

Brown,  T.C.  (1978).  A  martingale  approach  to  the  Poisson  convergence  of  simple 
point  processes.  The  Annals  of  Probability  Vol.  6  726-744. 

Brown, T.C. (1983). Some Poisson Approximations Using Compensators. The
Annals of Probability Vol. 11 No. 3 726-744.

Chou, C.S. and Meyer, P.A. (1975). Sur la Representation des Martingales
Comme Integrales Stochastiques Dans les Processus Ponctuels.
Seminaire Proba. IX, Universite de Strasbourg, Lect. Notes in
Math. 465, Springer-Verlag, 226-236.

Chung,  K.L.  (1982).  Lectures  from  Markov  Processes  to  Brownian  Motion. 
Springer-Verlag. 

Chung, K.L. and Doob, J.L. (1965). Fields, optionality and measurability. Amer. J.
Math. 397-424.


Chung,  K.L.  and  Williams,  R.J.  (1983).  Introduction  to  Stochastic  Integration. 
Birkhauser. 

Davis,  M.  (1976).  The  Representation  of  Martingales  of  Jump  Processes.  SIAM  J 
Control  and  Opt.  Vol  14. 

Dellacherie, C. (1970). Un Exemple de la Theorie Generale des Processus.
Seminaire Proba. IV, Universite de Strasbourg, Lect. Notes in
Math. 124, Springer-Verlag, 60-70.

Dellacherie,  C.  (1972).  Capacites  et  processus  stochastiques.  Springer-Verlag. 

Dellacherie,  C.  (1978).  Un  survol  de  la  theorie  de  l’integrale  stochastique.  Proc.  of 
the  International  Congress  of  Mathematicians  Helsinki  2  733. 


Dellacherie,  C.  and  Meyer,  P.-A.  (1975).  Probabilities  and  Potential.  North 
Holland  Mathematical  Studies  No.  29. 

Doleans-Dade, C. (1970). Quelques applications de la formule de changement de
variables pour les semi-martingales. Z. Wahr. verw. Gebiete 16
181-194.

Doleans-Dade, C. (1976). On the Existence and Unicity of Solutions of Stochastic
Integral Equations. Z. Wahr. verw. Gebiete 36 93-101.

Dellacherie, C. and Meyer, P.-A. (1980). Probabilities and Potential B: Theory of
Martingales. North Holland Mathematical Studies No. 72.

Doleans-Dade, C. and Meyer, P.A. (1970). Integrales Stochastiques Par Rapport
Aux Martingales Locales. Seminaire Proba. IV, Universite de
Strasbourg, Lect. Notes in Math. 124, Springer-Verlag, 77-107.

Doob,  J.L.  (1953).  Stochastic  Processes.  John  Wiley  &  Sons. 

Dubins, L., Schwarz, G. (1965). On Continuous Martingales. Proc. Nat. Acad.
Sci. U.S.A. 53.

Gill,  R.D.  (1980).  Censoring  and  Stochastic  Integrals.  Mathematical  Centre 
Tracts  124. 

Helland,  I.  (1981).  Central  Limit  Theorems  for  Martingales.  Scand.  J.  Statistics  9 


Ikeda,  N.  and  Watanabe,  S.  (1981).  Stochastic  Differential  Equations  and 
Diffusion  Processes.  North-Holland  Publishing  Company. 


Itmi, M. (1980). Histoire Interne des processus ponctuels marques stochastiques.
Etude d'un probleme de filtrage. These de 3eme cycle U. de Rouen.


Ito,  K.  (1944).  Stochastic  Integral.  Proc.  Imp.  Acad.  Tokyo.  20. 


Ito,  K.  (1951).  Multiple  Wiener  Integral.  J.  Math.  Soc.  Japan  3. 


Jacobsen, M. (1982). Statistical Analysis of Counting Processes. Lect. Notes in
Statistics,  No.  12,  Springer-Verlag. 




Jacod, J. (1975). Multivariate Point Processes: Predictable Projection,
Radon-Nikodym Derivatives, Representation of Martingales. Z. Wahr. verw.
Gebiete 32 235-253.

Jacod, J. (1976). Un theoreme de representation pour les martingales
discontinues. Z. Wahr. verw. Gebiete 34 225-244.

Jacod,  J.  (1979).  Calcul  Stochastique  et  Problemes  de  Martingales.  Lect.  Notes  in 
Math.  714  Springer-Verlag. 

Kabanov,  J.,  Lipcer,  R.,  Sirjaev,  A.  (1979).  Absolute  Continuity  and  Singularity 
of  Locally  Absolutely  Continuous  Probability  Distributions.  I.  Math 
USSR  Sbornik  Vol.  35,  No  5,  631-680. 

Kabanov,  Y.M.  and  Liptser,  R.S.  and  Shiryaev,  A.N.  (1983).  Weak  and  Strong 
Convergence  of  the  Distributions  of  Counting  Processes.  Theory  of 
Probability and its Applications Vol XXVIII 303-336.

Kallianpur,  G.  (1980).  Stochastic  Filtering  Theory.  Springer-Verlag. 

Karlin, S. and Taylor, H.M. (1975). A First Course in Stochastic Processes
(Second Edition). Academic Press.

Kunita,  H.  and  Watanabe,  S.  (1967).  On  Square  Integrable  Martingales.  Nagoya 
Math.  J.  30  209-245. 

Levy, P. (1937). Theorie de l'addition des variables aleatoires. Paris.

Liptser, R. and Shiryayev, A. (1977). Statistics of Random Processes I General
Theory. Springer-Verlag.

Liptser, R. and Shiryayev, A. (1978). Statistics of Random Processes I General
Theory. Springer-Verlag.

Loeve,  M.  (1960).  Probability  Theory.  Van  Nostrand. 

Metivier, M. (1977). Reelle und Vektorwertige Quasimartingale und die Theorie der
Stochastischen Integration. Lect. Notes in Math. 607, Springer-Verlag.

Metivier, M. (1982). Semimartingales: A Course on Stochastic Processes. Walter
de Gruyter, Berlin, New York.

Meyer, P.A. (1967). Integrales Stochastiques (4 exposes). Seminaire Proba. I,
Université de Strasbourg, Lect. Notes in Math. 39, Springer-Verlag.

Meyer, P.A. (1969). Les Inegalites de Burkholder en Theorie des Martingales
d'apres R. Gundy. Seminaire Proba. III, Université de Strasbourg,
Lect. Notes in Math. 88, Springer-Verlag.

Meyer,  P.A.  (1973).  Martingales  and  Stochastic  Integrals  I.  Lect.  Notes  in  Math. 
284,  Springer-Verlag. 

Meyer, P.A. (1976). Un Cours Sur les Integrales Stochastiques. Seminaire Proba.
X, Université de Strasbourg, Lect. Notes in Math. 511, 245-400,
Springer-Verlag.

Meyer,  P.A.  (1966).  Probability  and  Potential.  Blaisdell. 

Neveu,  J.  (1975).  Discrete-Parameter  Martingales.  North  Holland. 

Oksendal,  B.  (1986).  Stochastic  Differential  Equations.  Springer-Verlag. 

Rao,  K.M.  (1969).  On  Decomposition  Theorems  of  Meyer.  Math.  Scand.  24  66-78. 

Rogers, L.C.G. (1981). Stochastic Integrals: Basic Theory. Stochastic Integrals,
Proc. LMS Durham Symposium, Lect. Notes in Math. 851, Springer-Verlag.

Segall, A., Davis, M. and Kailath, T. (1975). Nonlinear Filtering with Counting
Observations. IEEE Transactions on IT Vol IT-21 No. 2 143.

Shiryayev,  A.N.  (1982).  Martingales:  Recent  Developments  Results  and 
Applications.  Scand  J.  Statistics  10. 

Shiryayev,  A.N.  (1984).  Probability.  Springer-Verlag. 

Skorokhod,  A.V.  (1956).  Studies  in  the  Theory  of  Random  Processes.  Addison- 
Wesley. 




Stroock,  D.,  Varadhan,  S.R.S.  (1956).  Multidimensional  Diffusion  Processes. 
Springer-Verlag. 

Stratonovich,  R.  L.  (1964).  A  New  Form  of  Representing  Stochastic  Integrals  and 
Equations.  Vestnik  Moskov.  Univ.  Ser.  I.  Math.  Meh.  1  3-12. 

Van  Schuppen,  J.H.  (1977).  Filtering,  Prediction  and  Smoothing  for  counting 
processes,  a  martingale  approach.  SIAM  J  App.  Math.  32  552-520. 

Van  der  Hoven,  P.C.T.  (1980).  On  Point  Processes.  Mathematical  Centre  Tracts. 

Watanabe,  S.  (1964).  Additive  Functionals  of  Markov  Processes  and  Levy 
Systems.  Jap.  J.  Math.  34  53-79. 

Wiener, N. (1923). Differential Space. J. Math. and Phys. Vol 2 131-174.

Williams, D. (1979). Diffusions, Markov Processes, and Martingales. Vol I
Foundations. Wiley & Sons.

Williams, D. (1981). To begin at the beginning: ... Stochastic Integrals, Proc.
LMS Durham Symposium, Lect. Notes in Math. 851, Springer-Verlag.

Wong, E. (1973). Recent Progress in Stochastic Processes (Applications of
Stochastic Processes, 1968-72) IEEE Trans. IT.

Wong, E. and Zakai, M. (1965). On the Convergence of Ordinary Integrals to
Stochastic Integrals. Ann. Math. Statist. 36 5 1560-1564.

Yor, M. (1977). Sur les theories du filtrage et de la prediction. Seminaire Proba.
XI, Université de Strasbourg, Lect. Notes in Math. 581, Springer-Verlag.

Yor, M. (1977). Sur quelques approximations d'integrales stochastiques.
Seminaire Proba. XI, Université de Strasbourg, Lect. Notes in
Math. 581, Springer-Verlag.

Yor,  M.  (1979).  Les  Inegalites  de  Sous-Martingales  Comme  Consequences  de  la 
Relation  de  Domination.  Stochastics  Vol  3  1-15. 


LIST  OF  SYMBOLS 




$X := (X(t),\ t \in R_+)$: Stochastic process.

$X(t,\omega) := X_t(\omega)$: Process evaluated at $(t,\omega)$.

$[X_t \in A] := \{\omega : X_t(\omega) \in A\}$

$\{X \in A\} := \{(t,\omega) : X_t(\omega) \in A,\ t \ge 0,\ \omega \in \Omega\}$

$X^T$: Process X stopped at time T.

V.X: Stochastic integral of V relative to X.

[X,X]: (Optional) quadratic variation of X.

<X,X>: (Previsible) quadratic variation of X.

$X_{t-} := \lim_{s \uparrow t} X_s$; $(X_-)_t := X_{t-}$, $t > 0$.

$\Delta X_t := X_t - X_{t-}$, $t \in R_+$.

$(X_-)_n := X_{n-1}$, $n \in Z_+$; $\Delta X_n := X_n - X_{n-1}$, $n \in Z_+$.

$\sigma(G)$: Sigma-algebra generated by the collection of sets, G.

$T_A$: Restriction of the stopping time T to the set A.

$1_A$: Indicator function of the set A.

[[T]]: Graph of the stopping time T.

[[S,T]]: A stochastic interval, S and T stopping times.

${}^pX$ (${}^oX$): Previsible (Optional) projection of X.

$X^p$: Dual previsible projection of X.

$X^*(t) := \sup_{s \le t} |X(s)|$: Supremum process.

G(PT): σ-algebra of previsible sets.


G(OT): σ-algebra of optional sets.

G(AT): σ-algebra of accessible sets.

$R_+$: Extended non-negative real line, $[0,\infty]$.

$Z_+$: Extended non-negative integers.

$a \wedge b$: Minimum of the numbers a and b.

X  :  Compensator  of  the  process  X.

Spaces of stochastic processes:

$M_u$: Uniformly integrable martingales.

$M_0 := (M_u)_0$: Members of $M_u$ with $M(0) = 0$.

$M_{loc} := (M_u)_{loc}$: Local martingales.

$M_{0,loc} := (M_0)_{loc}$

K2: Square integrable martingales.

K2,c: Continuous square integrable martingales.

K2,d: Purely discontinuous square integrable martingales.

M[T]: Square integrable martingales, continuous outside of [[T]].

V+: Increasing processes.

BV: Processes of bounded variation.

IV+: Integrable increasing processes.

IV: Processes of integrable variation.

INDEX OF DEFINITIONS


absolutely  continuous,  4.6.23. 
accessible,  2.6.3. 
adapted,  1.3.,  2.3.7. 
admissible,  4.2. 

almost  surely,  relative  to  P,  2.3.1. 
announce,  2.6.1. 

announcing  sequence  for  T,  2.6.1. 
associated,  4.6.11. 

bounded  stopping  time,  1.7.13. 
bounded  variation,  3.2.4. 

Brownian  motion,  6.10.3. 

cadlag,  2.3.12. 

charge  a  stopping  time,  2.7.19. 
class  D,  6.4.1. 

closes  the  martingale,  2.8.8. 
compensated  jump  martingale,  6.4.10.,  6.5.10. 
compensator,  1.7.2.,  1.7.4.,  1.10.2.,  4.6.16. 
complete,  2.2. 

conditional expectation, A 1.1.1.
continuity,  2.3.1. 

continuous  local  martingale,  6.4.9. 

continuous  part,  3.2.7.,  6.5.13. 

continuous  semi-martingale,  6.11.1. 

counting process, 3.1.1.

covariance  process,  1.5. 

covariance  process,  6.6.3. 

cross  quadratic  variation,  1.5.,  3.2.12.,  6.6.5. 

debut,  2.4.3. 

difference  process,  1.5.1. 

diffusion  coefficient,  5.3.3. 

discrete  integral  of  Yr  with  respect  to  X,  1.4. 

discrete  point  process,  1.8.2.,  1.10.1. 

doubly  stochastic  Bernoulli  process,  1.10.4. 

drift  process  B(t)  (drift  rate  f),  5.3.3. 

driven,  1.10.4. 


dual previsible projection, 3.1.1., 4.6., 4.6.7., 4.7.4., 6.3.3.
dynamical  system,  1.12.1. 

elementary  stochastic  integral,  6.2.2.,  6.2.4. 
equivalent norms, 2.8.6.

evaluating  the  process  at  the  stopping  time  T,  1.3.1.,  2.7. 

evanescent,  1.3.1.,  2.3.2. 

exhaust  the  jumps,  2.7.19. 

exponential  of  a  semi-martingale,  10.11.7. 

filtered  probability  space,  2.2. 

filtration,  1.2.,  2.2. 

filtration  generated  by  X,  1.2.,  2.3.7. 

finite  variation,  3.2.4. 

first  entrance  time  of  X,  2.5.2. 

flow of information, 1.7.7.

foretelling  sequence,  2.6.1. 

graph  of  a  stopping  time,  2.5. 

hitting  time,  2.5.2. 

increasing  process,  1.7.3,  3.2.1. 
independent  increments,  6.10.3. 
indistinguishable,  1.3.1.,  2.3.2. 
innovation  process,  1.12.2. 
innovations  gain,  1.12.2. 
integer  valued  random  measure,  4.7.3. 
integrable,  3.2.1. 
integrable  variation,  3.2.4. 
integrals,  3.2. 

integrated signal plus noise, 5.3.3.
integration  by  parts,  1.8. 
intensity,  1.10.1.,  4.6.24. 
internal  history,  1.2. 

Ito  diffusion,  10.11.8. 

jump  at  a  stopping  time,  2.7.19. 
jump  measure,  4.7.8. 
jump  process,  4.7.8. 


kernel,  2.7.12. 


Lp  -  bounded,  2.8.5. 

Lp  martingale,  2.8.5. 
local  characteristics,  6.10.1. 
local  integrable  variation,  6.3.1. 
local  martingale,  5.1.1. 
localization,  5.1.8. 
localized  class,  6.1.2. 
localizing sequence, 5.1.2.
locally  integrable,  6.3.1. 
local  Lp  -martingale,  5.1.1. 

mark  space,  3.1.1. 
marked  point  process,  3.1.1.,  4.7.3. 
martingale, 1.6.1., 2.8., 6.7.
martingale  problem,  10.11.8. 
martingale  compensator,  1.10.2. 
martingale  transforms,  1.7.8. 
martingales,  2.8.,  6.7 

measurability  relative  to  the  filtration,  2.3.10. 
measurable, 2.3., 2.3.9.

measurable  random  variable  or  function,  1.3. 
measures  generated  by  increasing  processes,  4.2. 
method  of  localization,  6.1.2. 
modifications,  2.3.2. 

n-debut,  2.7.18. 
natural  filtration,  1.2. 
non-explosive,  3.1.1. 
nonanticipating, 2.3.7.

observable, 1.3., 1.11., 2.3.7.
optional,  2.7.9. 
optional  projections,  4.5. 
orthogonal,  6.4.6. 

P-null  set,  2.2. 

packet  radio  networks,  1.10.2. 
path segments, 4.1.
point  process,  3.1. 


predictable,  2.6.1. 
previsible, 1.3., 2.6.1.

previsible compensator, 1.7.4., 3.1.1., 4.6.16.

previsible  projection,  4.3. 

previsible  quadratic  variation,  6.6.1. 

prior  to  T,  denoted  F(T),  2.4.4. 

probability  space,  1.2. 

process  stopped  at  time  T,  2.7.3. 

progressive,  2.3.10. 

progressive  measurability,  2.3.10. 

purely  discontinuous,  3.2.7.,  6.4.10. 

quadratic  variation,  1.5.,  6.6.5. 
quasi-left  continuous  filtration,  2.7.4. 
quasi-left  continuous  process,  2.7.24. 
queue,  5.3.6. 

random  measure,  4.7.3. 

random  measure  of  a  point  process,  4.7.3. 

random  set,  1.3.1,  2.3.2.,  4.5 

random  shift,  2.7.3. 

random  variable,  1.3. 

raw  increasing  process,  3.2.3. 

reduces,  6.5.17. 

reference  family,  2.3.8. 

restriction,  2.6.4. 

Riemann-Stieltjes,  3.2.13. 

right  continuity,  2.3.1. 

right  continuous,  2.2. 

right  continuous  modification,  2.8.4. 

saltus  measure,  4.7.8. 
semi-martingales,  1.11,  1.7.6,  5.2.1,  6.5.1 
simple  point  process,  3.1.1. 
single  filtration,  2.3.8. 

Skorokhod  processes,  2.3.12. 
solution,  6.11.7. 

special  semi-martingale,  1.7.6.,  6.5.3. 
square  brackets,  1.5.,  3.2.12. 
square  integrable,  2.8.5. 
square  integrable  martingales,  6.4.5. 


stable,  6.1.2. 
state space, 2.3.6.

stationary  independent  increments,  6.10.3. 

stochastic differential equation, 6.11.7.

stochastic  integral,  3.2.5.,  6.7.4.,  6.7.9.,  6.8.1.,  6.9.1 

stochastic  interval,  1.7.11.,  2.5. 

stochastic  process,  1.3.,  2.3. 

stopped  at  time  T,  1.3.1. 

stopping  time  (optional  time),  1.2.1.,  2.4.1. 

submartingale,  2.8.1. 

supermartingale,  1.6.1.,  2.8.1. 

terminal  random  variable,  2.8.8. 
thin,  4.5. 

totally  inaccessible,  2.6.3. 
trace σ-algebra, 1.3.
trajectories,  2.3.1. 
transform,  1.4. 

transition probability, A 1.2.1.
transition measure, A 1.2.4.
translation, 2.7.3.
trivial filtration, 1.3., 3.1.
trivial stochastic integral, 6.2.2.
truncation,  2.7.3. 

uniformly  integrable,  2.8.8. 
usual  conditions,  2.2. 

variance  process,  1.5. 
variation,  6.3. 
versions,  4.1. 

Wiener  process,  6.10.3. 


zero  stopping  time,  2.7.11. 


Appendix  A 


A  1.  Odds  and  Ends,  including  Fubini’s  Theorem. 

A  1.1.  Some  Useful  Definitions  and  Results: 

A  1.1.1.  Conditional  Expectation: 

Let $(\Omega, H, P)$ be a probability space and G be a sub σ-algebra of H. Let X be a
P-integrable random variable and define the measure $\mu$ on G by setting

$$\mu(A) := \int_A X(\omega)\,P(d\omega) = \int_A X\,dP, \qquad A \in G.$$

Then $\mu$ is a finite (signed) measure on G which is absolutely continuous relative to the
restriction of P to G. The Radon-Nikodym derivative of $\mu$ with respect to this
restriction is called the conditional expectation of X given G. Therefore,
E(X | G) is an a.s.P unique G-measurable integrable random variable Z which is
characterized by

$$\int_A Z\,dP = \int_A X\,dP, \qquad (1)$$

for all A in G, since P and its restriction agree on G.

The following is a list of some of the more important properties of conditional
expectation. These properties together with equation (1) are constantly (and
silently) used in Chapters 1 through 6.

Let X and Y be P-integrable random variables and a, b real numbers. Then

(i) E(aX + bY | G) = aE(X | G) + bE(Y | G), a.s.P.

(ii) If Y is G-measurable and XY is P-integrable, then
E(XY | G) = YE(X | G), a.s.P.

(iii) If J is a sub σ-algebra of G, then
E(X | J) = E(E(X | G) | J), a.s.P.

(iv) E(1 | G) = 1.

(v) If $X \ge 0$ a.s.P, then $E(X | G) \ge 0$.

(vi) If $X_n \in L_1(P)$, for all $n \in Z_+$, and $X_n \to X$ in $L_1(P)$, then
$E(X_n | G) \to E(X | G)$ in $L_1(P)$.

(vii) If $X_n \in L_1(P)$, for all $n \in Z_+$, $X_n \uparrow X$, a.s.P, and $X \in L_1(P)$, then
$E(X_n | G) \to E(X | G)$, a.s.P.

(viii) If $h: R \to R$ is convex, and $h(X) \in L_1(P)$, then

$h(E(X | G)) \le E(h(X) | G)$, a.s.P.

Remark:  Properties  (ii)  and  (iv)  combine  to  yield  Y  =  E(Y  |  G),  a.s.P,  when  Y  is 
G-measurable  and  P-integrable. 
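
The defining equation (1) and the tower property (iii) can be checked by hand on a finite probability space. The following is a minimal sketch, not part of the original text: the six-point space, the uniform P, the variable X and the two partitions generating G and J are all hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite setting: Omega = {0,...,5}, uniform P, X(w) = w^2.
omega = list(range(6))
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}

# G is generated by a partition; J by a coarser one, so J is a sub sigma-algebra of G.
G_blocks = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
J_blocks = [frozenset({0, 1}), frozenset({2, 3, 4, 5})]

def integral(f, A):
    """Integral of f over the set A with respect to P."""
    return sum(f[w] * P[w] for w in A)

def cond_exp(f, blocks):
    """E(f | sigma(blocks)): on each block, the P-weighted average of f."""
    out = {}
    for B in blocks:
        value = integral(f, B) / sum(P[w] for w in B)
        for w in B:
            out[w] = value
    return out

def sigma_algebra(blocks):
    """All unions of blocks, i.e. every set in the generated sigma-algebra."""
    return [frozenset().union(*combo)
            for r in range(len(blocks) + 1)
            for combo in combinations(blocks, r)]

Z = cond_exp(X, G_blocks)

# Equation (1): the integrals of Z and X agree on every A in G.
eq1_holds = all(integral(Z, A) == integral(X, A) for A in sigma_algebra(G_blocks))

# Property (iii): E(X | J) = E(E(X | G) | J).
tower_holds = cond_exp(Z, J_blocks) == cond_exp(X, J_blocks)
```

Exact rational arithmetic is used so that the equalities are tested without rounding error.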

A  1.1.2.  Skorokhod  Processes 

A function is called Skorokhod if it is right continuous with left limits at each
point in its domain. Some basic results on such functions can be found in
Billingsley [1968]. Billingsley considers real-valued Skorokhod functions defined on
compact intervals. In Chapters 2 to 6 in the body of the present note, the usual
domain for functions (as paths of stochastic processes) is the interval $[0,\infty)$.
The results from Billingsley that we quote here carry over in an obvious way to
this domain. For this purpose, let f be a Skorokhod function defined on $[0,\infty)$.
Then

(i)  f  has  at  most  a  countable  number  of  discontinuities; 

(ii) On any compact interval, f has at most a finite number
of discontinuities where the magnitude of the corresponding
jump exceeds a specified fixed positive number;

(iii)  f  is  bounded  on  compact  intervals. 


A  1.2.  Fubini’s  Theorem: 

A 1.2.1. Definition (Transition Probability): Let $(v_\omega,\ \omega \in \Omega)$ be a family of
probability measures on the measurable space (E,G). Let $(\Omega,H)$ be a measurable space. If
the mapping $\omega \to v_\omega(B)$ is H-measurable, for each B in G, then the family
$(v_\omega,\ \omega \in \Omega)$ is called a transition probability from $(\Omega,H)$ to (E,G).




A 1.2.2. Theorem (Fubini):

Let $U = E \times \Omega$, $V = G \times H$, and f be a real valued, V-measurable function (a
random variable on (U,V)).

(i) Then, for each $\omega \in \Omega$,

$(x \to f(x,\omega))$ is G-measurable

and, for each $x \in E$,

$(\omega \to f(x,\omega))$ is H-measurable.

(ii) Further, let P be a probability measure on $(\Omega,H)$ and
$(v_\omega,\ \omega \in \Omega)$ a transition probability from $(\Omega,H)$ into (E,G).
Then there exists a unique probability measure, $\mu$, on (U,V)
such that

$$\mu(C \times D) = \int_D v_\omega(C)\,P(d\omega)$$

for all $C \in G$ and $D \in H$.

(iii) If f is non-negative, then

$$\Big(\omega \to \int_E f(x,\omega)\,v_\omega(dx)\Big) \text{ is H-measurable}$$

and

$$\int_U f\,d\mu = \int_\Omega \int_E f(x,\omega)\,v_\omega(dx)\,P(d\omega). \qquad (2)$$

If $f \in L_1(\mu)$, then equation (2) holds and $(x \to f(x,\omega)) \in L_1(v_\omega)$,
a.s.P.

A 1.2.3. Remark: In the special case that $v_\omega$ is independent of $\omega$, $\mu$ is called the
product measure.

A 1.2.4. Remark: We have stated Fubini’s Theorem in terms of transition
probabilities. It holds also, and will be applied, when the indexed family of probability
measures in the definition of transition probability is replaced by an indexed
family of σ-finite measures, satisfying the measurability condition of the definition.
The result is called a transition measure.
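
On finite spaces the construction in Fubini's Theorem reduces to finite sums, which makes the rectangle formula in (ii) and the iterated integral (2) easy to verify. The sketch below is illustrative only: the sets E and Omega, the weights in P and v, and the function f are hypothetical choices, not taken from the text.

```python
from fractions import Fraction as F

# Hypothetical finite spaces and measures.
E = [0, 1, 2]
Omega = ["a", "b"]
P = {"a": F(1, 3), "b": F(2, 3)}

# Transition probability: for each w, v[w] is a probability measure on E.
v = {
    "a": {0: F(1, 2), 1: F(1, 2), 2: F(0)},
    "b": {0: F(1, 4), 1: F(1, 4), 2: F(1, 2)},
}

# The measure mu on U = E x Omega, defined on points by mu{(x, w)} = v_w{x} P{w}.
mu = {(x, w): v[w][x] * P[w] for w in Omega for x in E}

def mu_rect(C, D):
    """mu(C x D) computed directly from the point masses of mu."""
    return sum(mu[(x, w)] for x in C for w in D)

def rect_formula(C, D):
    """Right-hand side of (ii): integral over D of v_w(C) P(dw)."""
    return sum(sum(v[w][x] for x in C) * P[w] for w in D)

def f(x, w):
    """An arbitrary non-negative function on U."""
    return x + (10 if w == "b" else 0)

# Left side of (2): integral of f with respect to mu.
lhs = sum(f(x, w) * m for (x, w), m in mu.items())

# Right side of (2): inner integral against v_w, then outer integral against P.
inner = {w: sum(f(x, w) * v[w][x] for x in E) for w in Omega}
rhs = sum(inner[w] * P[w] for w in Omega)
```

When v[w] does not depend on w, the same code produces the product measure of Remark A 1.2.3.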
Appendix B


B  1.  Lebesgue-Stieltjes  Stochastic  Integrals: 


B  1.1.  On  the  Existence  of  a  Lebesgue-Stieltjes  Stochastic  Integral: 

We will now give a detailed explanation of the existence of the stochastic
Lebesgue-Stieltjes integral induced by an increasing stochastic process, A.

Let $B = B([0,\infty))$ be the σ-algebra of subsets generated by intervals of the form
(a,b], a and b non-negative. Let $C \to v(C,\omega)$, $C \in B$, be the measure on B induced
by the right continuous, increasing function, A, by setting $v((a,b],\omega) := A(b,\omega) - A(a,\omega)$,
for all non-negative a and b (a<b) and each $\omega \in \Omega$.

Let $V := B \times H$ be the product σ-algebra on $U := [0,\infty) \times \Omega$. Since $t \to A(t,\omega)$ is
right continuous and increasing, A is V-measurable. Therefore, the mapping $\omega \to v(C,\omega)$ is
H-measurable for each C in B.


By Fubini’s Theorem, given the family of σ-finite transition measures $\{v(\cdot,\omega) : \omega \in \Omega\}$
and the probability measure, P, on H, there corresponds a unique σ-finite
measure, $\mu$, from V into $[0,\infty]$, such that

$$\mu(C \times D) = \int_D v(C,\omega)\,P(d\omega)$$

for all $C \in B$ and $D \in H$, and for any $\mu$-integrable real function, f, defined on U, the
mapping

$$\omega \to \int_{[0,\infty)} f(s,\omega)\,v(ds,\omega)$$

is H-measurable, and

$$\int_U f\,d\mu = \int_\Omega \int_{[0,\infty)} f(s,\omega)\,v(ds,\omega)\,P(d\omega).$$


Let X be any V-measurable process such that $E \int_0^\infty |X(s)|\,dA(s) < \infty$, where dA
denotes the integration relative to the measure v.

Then

$$\int_U X\,d\mu = E \int_0^\infty X(s)\,dA(s)$$

and Fubini's theorem states that the pathwise Lebesgue-Stieltjes integral

$$(X.A)_\infty(\omega) = \int_0^\infty X(s,\omega)\,dA(s,\omega)$$

exists, a.s.P. The process $((X.A)_t,\ t \ge 0)$ is then defined by setting
$(X.A)_t := (1_{[0,t]}X.A)_\infty$, for each $t \ge 0$.
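
For a single path on which A is a right continuous step function (for instance a counting process path), the measure v is a sum of point masses at the jump times, and the pathwise integral reduces to a finite sum. The sketch below illustrates this special case only; the jump times, jump sizes and integrand are hypothetical, and jump times are assumed strictly positive because v is built from intervals (a,b] with a non-negative and so places no mass at 0.

```python
def A(t, jumps):
    """Right continuous increasing step path: total size of the jumps at times <= t."""
    return sum(size for s, size in jumps if s <= t)

def v(a, b, jumps):
    """Measure of the interval (a, b] induced by A, i.e. A(b) - A(a)."""
    return A(b, jumps) - A(a, jumps)

def stieltjes_integral(X, jumps, t=float("inf")):
    """(X.A)_t for a pure-jump path: sum of X(s) * (jump of A at s) over 0 < s <= t."""
    return sum(X(s) * size for s, size in jumps if 0 < s <= t)

# Hypothetical path: unit jumps at times 1.0, 2.5 and 4.0; integrand X(s) = s.
jumps = [(1.0, 1.0), (2.5, 1.0), (4.0, 1.0)]
X = lambda s: s

value_at_3 = stieltjes_integral(X, jumps, 3.0)   # uses the jumps at 1.0 and 2.5
value_at_inf = stieltjes_integral(X, jumps)      # uses all three jumps
```

Restricting the sum to s <= t plays the role of the indicator of [0,t] in the definition of the process X.A.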


B  1.2.  Monotone  Class  Theorem: 

B  1.2.1.  Theorem:  Let  O  be  a  set  and  C  a  collection  of  subsets  of  O  which  is 
closed  under  finite  intersection. 

1)  Let  S(C)  be  the  smallest  collection  of  subsets  of  O  which  contains  C  and 
satisfies 

a) $O \in S(C)$;

b) If $A, B \in S(C)$, with A a subset of B, then $B - A \in S(C)$;

c) S(C) is closed under countable unions of increasing
sequences of its members.

Then S(C) is the smallest σ-algebra containing C.




2) Let H* be a vector space of real-valued functions defined on the set O and
satisfying

a) $1 \in H^*$ and if $A \in C$ then $1_A \in H^*$;

b) If $(f_n,\ n \ge 0)$ is an increasing sequence of nonnegative
members of H*, with bounded supremum, then $\sup\{f_n : n \ge 0\}$
is also a member of H*.

Then H* contains all bounded real-valued functions, defined on O, which are
measurable relative to the σ-algebra generated by C.